Source: https://github.com/li081766/shinyWGD/blob/main/vignettes/intro_to_shinywgd.Rmd

1 Getting Started

This vignette provides an introduction to WGD inference using the shinyWGD server or package.

shinyWGD can prepare the input and command lines for wgd, ksrates, i-ADHoRe, OrthoFinder, and Whale and then study the whole genome duplication events (WGDs) of the interesting species.

This page provides a brief introduction to this server.

2 Installation

2.1 Install the latest development version from GitHub

if (!require("devtools")) {
  install.packages("devtools")
}
devtools::install_github("li081766/shinyWGD", dependencies=TRUE, build_vignettes=TRUE)

3 Usage

library(shinyWGD)
runshinyWGD()
help(package="shinyWGD")

3.1 Data Preparation

3.1.1 Preparing the input for `wgd`, `ksrates`, `i-ADHoRe` and `OrthoFinder`

Please select the proper number of species you want to study in Select Number of Species to Analyze: first.

💡 Note

If only one species is set, the core program will be set to wgd, otherwise, ksrates will be chosen.

shinyWGD provides two options to upload the data in the Data Preparation Page.

3.1.1.1 Option One

Users can prepare a Tab-Separated file in which each row includes the data of a species.

This option is suitable to study multiple species.

The detailed format of this file is below:

Tab-Separated file

latin name	cds file name^a	gff file name^b (optional)
Elaeis guineensis	elaeis.fasta	elaeis.gff
Oryza sativa	aryza.fasta	oryza.gff3
Asparagus officinalis	asparagus.fasta.gz

^a cds is the coding region of a gene or the portion of a gene’s DNA or RNA that codes for protein. This file can be fasta or gzipped-fasta.
This column is mandatory.
^b gff is the format consists of one line per feature, each containing nine columns of data, plus optional track definition lines, see more click here.
The file can be gff, gff3, gzipped-gff, or gzipped-gff3.
This column is optional.
focal species is the parameter in ksrates, which will be set in the configure file of ksrates.

Warning

Do not contain the alternative isoforms of each gene.
Do not contain the Header Row when preparing the table.

💡 Note

To ensure the successful transformation of of annotation file, please review the type (the third column) and attributes (the ninth column) of the GFF file and make sure that when the type is equal to mRNA, the attributes column starts with ID=.
Download the example GFF file of Arabidopsis thaliana from PLAZA.

Now users need to upload the corresponding files in the Upload CDS / Annotation Files: field.

⚠️ Ensure that the uploaded files match the definitions in the Tab-Separated file.

After successfully uploading the tab-separated file and the CDS / Annotation files, the number of species in the file will be calculated.

If only one species needed to be studied, shinyWGD will generate the command line for wgd by clicking the Create wgd Codes in the right Setting panel.
- Users can check the codes by clicking the Go to wgd Scripts button.
- Then, users can click the Run wgd button to operate wgd program in PSB computing cluster in Script Executing panel. After the results are generated, an email will be sent to the provided email address.
- ⚠️ The Script Executing panel only works on the server version of shinyWGD of PSB.
- Then, users can click the Download Analysis Data button to download the prepared files and script for wgd program.
If multiple species needed to be study, shinyWGD will generate the command line for ksrates.
- Two more parameters should be added.
  - focal species set the focal species for ksrates.
  - newick tree upload a tree file in newick format. The tree should contain all the species.
- Then, users can click the Create ksrates Codes button to generate the configure file and command line for ksrates.
- After that, users can click the Go to ksrates Scripts button to review the configure file and command line for ksrates.
- Then, users can click the rest buttons to generate the scripts for i-ADHoRe and OrthoFinder.
- Then, users can submit the computing to the PSB computing cluster. This also only works on the server version of shinyWGD of PSB.
- Finally, users can click the Download Analysis Data button to download the prepared files and script for ksrates, i-ADHoRe, and OrthoFinder program.

3.1.1.2 Option Two

This option provide users a more straightforward way to upload data.

After selecting the number of studied species, the corresponding uploading panels will be created.
Three items (latin name, cds fasta, and annotation gff) of each species should be set or uploaded by users.
Once more than one species needed to be analyzed, the focal species and newick tree should be defined by users.
Then users can click the Create wgd Codes or Create ksrates Codes button to create the codes for wgd or ksrates according to the species number.
After reviewing and downloading the necessary files, users can execute wgd, ksrates, i-ADHoRe or OrthoFinder in their local environment or submit the computing to the PSB computing cluster.

Warning

shinyWGD will exclude genes with internal stop codons and genes whose lengths are not divisible by three. You can find details on the gene filtering process in the Sequence_processing.log file located in the Analysis_Date folder.

3.1.2 Executing `wgd`, `krates`, `i-ADHoRe` or `OrthoFinder` locally

After decompressing the downloaded compressed file (Analysis_Date.tgz), users will find the bash and configure files (run_wgd.sh / run_ksrates.sh, run_paralog_ks_rest_species.sh, run_diamond_iadhore.sh, run_orthofinder.sh).

💡 Note

In the first several lines of each run_*.sh file, we include the #SBATCH directive to specify the memory and thread usage. This is essential because our cluster relies on SLURM for workload management. Users can modify these lines to align with their specific cluster configurations.
Our cluster also uses Environment Modules to manage software. In the script files, please comment out the module load commands for packages if you are not using Environment Modules.

Users can execute them in their local environment.

How to install wgd to your local environment, please read the wgd document.
How to install ksrate to your local environment, please read the ksrates document.
How to install i-ADHoRe to your local environment, please read the i-ADHoRe document.
How to install OrthoFinder to your local environment, please read the OrthoFinder document.

💡 Note

When using i-ADHoRe to do synteny analysis, the alignment will be produced firstly. The alignment process will use the diamond program. So please install diamond program first.

💡 Note

To compute the K_s value of each anchored gene pair and the K_s unit tree of studied species, the MAFFT, trimAl, and PAML will be used for aligning the proteins, obtaining the nucleotide alignment and calculating the K_s value, respectively. So please install MAFFT, trimAl, and PAML first.

3.2 Whale Preparation

In this page, users can use the output of OrthoFinder to prepare the input for Whale, a sophisticated package for the gene tree - species tree reconciliation analysis.

Users need to upload two inputs to operate Whale.jl. One is the species time tree in Newick format. Another is the folder which includes all ALE files. If users use OrthoFinder in shinyWGD, users can find the folder in Orthofinder_wd, named OrthoFinderOutput_for_Whale.tar.gz. Users can also follow the instruction of Whale.jl to prepare the ALE files by using the gene families built by other software.

💡 Note

To run the run_Whale_preparing.sh to prepare the ALE inputs for Whale, the MrBayes and ALE will be operated to transform each gene family into its corresponding ALE file. How to install these two software, please refer to their websites.

Once the tree file uploaded, users can check the topology of the tree in the right panel. Then users can add the hypothetical WGD event to a certain branch by clicking the branch.

Warning

Users should use wgd as the prefix for the name of hypothetical WGD event to satisfy the requirements of Whale.jl.

In the above tree figure, four hypothetical WGD events are tested, named as wgd1 and wgd2, respectively.

Users can review the added hypothetical WGD event in the Hypothetical WGDs to test: panel in the left side.

Users can define the Base Model in the Base Model for Whale panel and set the Chain rounds in the Set the Chain for Whale panel.

After setting all the parameters, users can create Whale scripts by clicking the Create Whale Codes button. Then, users can review the scripts by clicking the Go to Whale Scripts button

3.3 TimeTreeFetcher

If users don’t ensure the evolutionary relationships among the studied species, shinyWGD provides an alternative way to generate the tree through interacting with TimeTree.org database.

Warning

TimeTree database does not include all the species. If the species are not included in the TimeTree database, you should ensure the relationship in other ways.

Please click the Tree Extraction block in the Scripts to extract a tree from TimeTree.org database.

A file containing species names in each row needs to be uploaded in the Upload Species List File: field.

Species names file

Vitis vinifera

Elaeis guineensis

Oryza sativa

Asparagus officinalis

After successfully obtaining the tree, users can review the tree in the right panel.

In the example, eight species were used to extract the tree from TimeTree.org. (A) The uploading panel allows users to upload the species information and start the extraction. (B) The tree display panel enables users to choose a proper strategy to build the tree and download the tree in two formats (Newick and Nexus format). When users choose As timetree.org, the module will create the tree as TimeTree.org do. In TimeTree.org, they use the adjusted time to build the species divergence time tree. However, the evolutionary close species with the same adjusted divergence time led to more than two branches sharing the same ancestral node in the tree. When users choose the median time strategy, the conflict will be corrected. (C) The species information panel. The species with identical time records will be assigned to the same group. In this example, Apostasia fujianica, Apostasia shenzhenica, and Neuwiedia singapureana share the same time record in TimeTree.org. Platanthera guangdongensis and Platanthera zijinensis also have identical records.

3.4 K_sDist

After successfully running the corresponding shell files, the necessary outputs will be saved in the Analysis_Date folder. Then users can choose the Ks_Data_for_Visualization.tar.gz file in the folder and upload it to the K_sDist page.

Warning

Users should choose the compressed file which is originated from the folder created by shinyWGD program to avoid the failure of displaying the correct file names.

3.4.1 Paralogous K_s age distribution

Once selecting one species from the drop-down menu of Paralogous K_s ▼, users can click the Confirm analysis to create the output panel in the right side of the page.

After clicking the Start , the paralogous K_s age distribution of the selected species will be generated.

shinyWGD employs two methods to predict the potential peaks of the distribution.

First, shinyWGD fits normal mixture models to the K_s data using maximum likelihood with EMMIX. shinyWGD will test one to ten components with 500 random starts. Fitted models are selected by the Bayesian Information Criterion (BIC) within EMMIX. Finally, the fitted models can be drawn above the bar plot of each species. Users can choose one or several peaks to draw the curve.

Second, shinyWGD uses SiZer to identify significant peaks based on the significant zero crossings of derivatives from kernel density estimation. Then, a subplot to show the output of SiZer will be added under the bar plot.

📚 Reference

Please refer to the papers below to see details about EMMIX and SiZer.

McLachlan G, Peel D. (1999) The EMMIX algorithm for the fitting of normal and t-components. Journal of Statistical Software. 4:1–14.

Barker, M.S., Kane, N.C., Matvienko, M., Kozik, A., Michelmore, R.W., Knapp, S.J. and Rieseberg, L.H., (2008) Multiple paleopolyploidizations during the evolution of the Compositae reveal parallel patterns of duplicate gene retention after millions of years. Molecular biology and evolution, 25(11), pp.2445-2455.

Chaudhuri, P. and Marron, J.S., (1999) SiZer for exploration of structures in curves. Journal of the American Statistical Association, 94(447), pp.807-823.

On the right side of the plot, a table with GMM peaks information is displayed. Users can choose which peaks to be displayed by clicking the check box of each row.

The example output is below.

3.4.2 Orthologous K_s age distribution

Users need to select the species from the Select Species to compare of the Orthologous K_s ▼ to compute the density of K_s age distribution among the selected species.

To obtain the potential peak of the plot, shinyWGD will fit the peak by using the Epanechnikov kernel function. Then 95% confidence interval (CI) is estimated by 1000 iterations.

On the right side of the density plot, an ultrametric tree is displayed with circles corresponding to the density plot. Users can remove comparisons of less interest by clicking the circle in the tree. A pop-up menu will appear, allowing users to delete the selected comparison.

The example output is below.

In the above figure, the dash line is the peak of a combination of the reference species and another species. The colored rectangle is the 95% CI. Users can retrieve the detail information about the density, the peaks, and the 95% CIs by hovering the mouse above the lines or rectangles.

3.4.3 Substitution rate correction

When comparing the one-to-one orthologous K_s distributions between studied species and the reference species, the different K_s peaks for the same speciation event may exist, suggesting different synonymous substitution rates among the studied species, resulting in an over- or under-estimate for the divergence between the studied species and the reference species.

To quantify the difference in substitution rates, shinyWGD employs the relative rate tests. First, the Newick tree, which contains the phylogenetic relationship, will be used to ensure enough species. Then, users need to define a reference species and an outgroup species to determine a tree structure among the species which will be used in the relative rate test. Then, the K_s distance between two species is estimated by the mode of their orthologous K_s distribution. Using the K_s distance between the reference species and the studied species, the K_s distance between the outgroup species and the reference species, the K_s distance between the outgroup species and the studied species, the distances to the studied species and the reference species after their divergence are computed. Finally, orthologous K_s between the studied species and the reference species is corrected by the double of the K_s distance to the reference species, assuming that studied species has the same substitution rate as the reference species.

The example output is below.

In the left figure, Kernel-density estimates (KDEs) of K_s distribution for one-to-one orthologues between A. officinalis and other three species, namely A. shenzhenica, D. catenatum and P. equestris. As the modes (peaks) of the KDE all represent the distances between A. officinalis and the compared species, the differences observed among the K_s values of the modes indicate substitution rate variations among the studied species. The segments with arrows above the KDE denote, from top to bottom, the K_s distances computed for P. equestris and D. catenatum using A. shenzhenica as the reference and A. officinalis as the outgroup. The dotted lines align with the observed modes in the KDE of the orthologous K_s distributions. The black dots on the segments represent the divergence between A. shenzhenica with D. catenatum and between A. shenzhenica with P. equestris; hence, the lengths of segments pointed to the right show the K_s accumulated in A. shenzhenica and D. catenatum / P. equestris, whereas the grey segments pointed to the left show the K_s distance between A. officinalis and the divergence of A. shenzhenica. The grey rectangles show the 95% confidence intervals of the K_s inferred from 200 bootstraps.

In the right figure, shinyWGD uses arrows with directions to indicate overestimation (to the right) or underestimation (to the left) of divergence events. The arrow points to the K_s values after correcting for different substitution rates. The bar and line plots are the paralogous K_s distributions of A. shenzhenica.

3.4.4 Phylo-K_s analysis

shinyWGD uses the single-copy gene families built by OrthoFinder to calculate the K_s unit tree. Users can check the tree in the Orthofinder_wd/ds_tree_wd folder, named singleCopyGene.ds_tree.newick. Once uploading the tree file in the K_s Tree File panel, the K_s unit tree will displayed in the right side of the page and a ultrametric tree will automatically added in the right right side of the K_s unit tree. Users can click the branch in the ultrametric tree to add the WGD events into the branch by observing the left K_s unit tree.

Users can click the text of the species name to change the color of the text.

Users can upload a file which contains the K_s peaks of each studied species to place these peaks into the K_s unit tree. This file can be created by operating Paralogous K_s ▼ module.

Or, users can prepare a file blow.

K_s Peaks file separated by Tab

Species	Peak in	Peak	95% Confidence Interval
Apostasia shenzhenica	Anchor pairs	0.97	0.55-2.19
Dendrobium catenatum	Anchor pairs	0.86	0.38-4.18
Phalaenopsis equestris	Anchor pairs	0.96	0.58-1.92

The example output is below.

3.5 Collinear Analysis

Users can choose the Collinear_Data_for_Visualization.tar.gz file in the Analysis_Date folder and upload it to the Collinear Analysis page. Then, users can study the intra-, inter-, and multiple- species synteny, as well as the clustering analysis between two species.

3.5.1 Intra- or Inter-species alignment

After selecting the studied species, users can click the Confirm analysis to create the setting and output panel in the right side of the page.

Users can set the threshold for the number of anchor pairs in each multiplicon to filter out the small multiplicons which contain less anchor pairs.

The dot plot and parallel link plot will be generated.

3.5.2 Multiplicon-level synteny

Users can also search a certain gene to check the location of the gene in the multiplicon in the Multiplicon-level Synteny panel and then plot the searched multiplicon.

3.5.3 Multiple species alignment

In the multiple species alignment panel, users can define the order of species in the plot. Users can change the color of species by clicking the text of the species name. When the mouse hovers a link, the related links with this link among species can be highlighted using different colors.

3.5.4 Clustering analysis

shinyWGD employs the Hierarchical clustering method for constructing putative ancestral regions (PARs). The segments are identified from the outputs of i-ADHoRe, and the homolog concentration scores are calculated using -log(p), where p is the probability of the observed number of homolog pairs as modeled by a Poisson distribution.

📚 Reference

To read more about homolog concentration scores, please refer to Putnam NH, et al. (2008) The amphioxus genome and the evolution of the chordate karyotype. Nature 453:1064–1071.

For each segment, the array of scores against all segments in the other genome forms a unique profile. The segments are then clustered based on the similarity of these profiles (determined by Pearson correlation coefficient r) using the average linkage method. The default cutoff of r is 0.3. Users can change this cutoff through the sliding block. The faint yellow rectangle will highlight the significantly enriched regions (P < 1E-10). Then, users can zoom in each PAR through the Putative Ancestral Regions block.

In the example, we used 0.5 as the cutoff for r and 10 to filter out the segment with fewer anchor points. In total, we identified 61 PARs.

A single PAR zooms in is below.

3.6 TreeRecon

In this page, users can use Whale to operate the gene tree - species tree reconciliation analysis.

After Whale is done, users can check the output in this page.

The example output is below.

We can judge whether the hypothetical WGD event is true or not based on the Bayes factor (K) to compare the likelihood of q = 0(H₀) to the likelihood of q > 0(H₁) using the Savage–Dickey density ratio.

Warning

Please Note: the log10 Bayes factor is calculated. Thus, a Bayes factor smaller than -2 could be considered as evidence in favor of the q ≠ 0 model compared to the q = 0 mode.

Besides, users should also check the ESS values. Generally, ESS values exceeding at least 100 is better for a good model, although short chains may be good for exploring and testing different models. If the ESS values below 100, please increase the Chain Rounds in the left setting panel.

Then, users can click the button to update the species tree to display the output of Whale.

The example output is below.

In the above plot, the WGD with the solid green bar is supported with retention rates significantly different from zero, while the hollow WGD bars are the ones with retention rates not different from zero.

Then users can download the plot by clicking the button.

3.7 VizWGD

In this page, users can visualize the time tree of MCMCTree, or a ultrametric tree.

3.7.1 Time tree

Users can upload a nexus tree withe the divergence time information. The format of the tree is the output file of MCMCTree, named FigTree.tre. Be careful with the time unit, please use the 100 million yeas as the time scale. Users can click the text of the species name to change the color or add a symbol to the species. Users can click the branch to add a WGD into the branch.

Users can upload a file which contains the WGD events of each studied species to place these WGDs into the tree.

WGD Events file separated by Tab

Species	WGD events^a	color
Elaeis guineensis	0.75	#ababab
Oryza_sativa	0.6308-0.6989,1.10-1.17	#ababab
Phalaenopsis_equestris	0.4565-0.7838	#ababab
Zostera_marina	0.64-0.72	#ababab
Spirodela polyrhiza	0.9079,0.8964	#ababab
Asparagus officinalis	0.3245-0.7013	#ababab
Ananas comosus	1.10-1.17,1.17-1.33	#ababab
Vitis vinifera	1.17-1.33	#ababab

^a WGD events can be a single time point or a time interval. Please use “,” to separate multiple WGD events within a species.

The example output is below.

3.7.2 Ultrametric tree

Users can also upload a ultrametric tree and then add the WGD events by clicking the branches manually.

The example output is below.

4 Structure

Home
Scripts
- Data Preparation
- Whale Preparation
- TimeTreeFetcher
Analysis
- K_sDist
- Collinear
- VizWGD
- TreeRecon
Gallery
Help

5 Dependencies

External Software
- wgd
  - Default settings
- ksrates
  - Default settings
- i-ADHoRe
  - anchor_points: 3
  - gap_size: 35
  - alignment_method: gg2
  - cluster_gap: 40
  - q_value: 0.75
  - visualizeGHM: false
  - visualizeAlignment: false
- Whale
  - parameter: Default settings
- OrthoFinder
  - Default settings
- diamond
  - Default settings
- MAFFT
  - Default settings
- trimAl
  - automated1
  - ignorestopcodon
- PAML
  - Default settings
- MrBayes
  - Default settings
- ALE
  - Default settings
R packages
- {shiny}
- {shinyjs}
- {shinyFiles}
- {shinyBS}
- {shinyWidgets}
- {shinyalert}
- {bslib}
- {bsplus}
- {htmltools}
- {stringi}
- {tidyverse}
- {vroom}
- {english}
- {data.table}
- {argparse}
- {dplyr}
- {tools}
- {seqinr}
- {DT}

6 References

Zwaenepoel, A., and Van de Peer, Y., (2019) wgd - simple command line tools for the analysis of ancient whole genome duplications. Bioinformatics, bty915.
Sensalari, C., Maere, S., and Lohaus R., (2021) ksrates: positioning whole-genome duplications relative to speciation events in KS distributions. Bioinformatics, btab602.
Proost, S., Fostier, J., De Witte, D., Dhoedt, B., Demeester, P., Van de Peer, Y. and Vandepoele, K., (2012) i-ADHoRe 3.0—fast and sensitive detection of genomic homology in extremely large data sets. Nucleic acids research, 40(2), pp.e11-e11.
Zwaenepoel, A., and Van de Peer, Y., (2019) Inference of ancient whole-genome duplications and the evolution of gene duplication and loss rates. Molecular biology and evolution, 36(7), 1384-1404.
Emms, D.M., and Kelly, S., (2015) OrthoFinder: solving fundamental biases in whole genome comparisons dramatically improves orthogroup inference accuracy. Genome biology, 16(1), pp.1-14.

7 Session info

sessionInfo()
#> R version 4.2.1 (2022-06-23)
#> Platform: x86_64-apple-darwin17.0 (64-bit)
#> Running under: macOS Big Sur ... 10.16
#> 
#> Matrix products: default
#> BLAS:   /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
#> 
#> locale:
#> [1] C/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] fontawesome_0.5.2 shinyWGD_1.0.5   
#> 
#> loaded via a namespace (and not attached):
#>   [1] nlme_3.1-157       fs_1.5.2           usethis_2.1.6     
#>   [4] lubridate_1.9.2    devtools_2.4.5     bit64_4.0.5       
#>   [7] httr_1.4.5         rprojroot_2.0.3    tools_4.2.1       
#>  [10] profvis_0.3.7      bslib_0.4.2        utf8_1.2.2        
#>  [13] R6_2.5.1           DT_0.27            KernSmooth_2.23-20
#>  [16] ade4_1.7-22        urlchecker_1.0.1   withr_2.5.0       
#>  [19] tidyselect_1.2.0   prettyunits_1.1.1  processx_3.8.0    
#>  [22] bit_4.0.4          compiler_4.2.1     english_1.2-6     
#>  [25] cli_3.6.0          argparse_2.2.2     xml2_1.3.3        
#>  [28] shinyjs_2.1.0      desc_1.4.2         sass_0.4.1        
#>  [31] mvtnorm_1.1-3      callr_3.7.3        commonmark_1.8.0  
#>  [34] bsplus_0.1.4       stringr_1.5.0      digest_0.6.29     
#>  [37] shinyBS_0.61.1     rmarkdown_2.21.1   pkgconfig_2.0.3   
#>  [40] htmltools_0.5.4    sessioninfo_1.2.2  fastmap_1.1.1     
#>  [43] htmlwidgets_1.6.1  rlang_1.0.6        rstudioapi_0.14   
#>  [46] shiny_1.7.5.1      jquerylib_0.1.4    generics_0.1.3    
#>  [49] jsonlite_1.8.4     mclust_6.0.0       vroom_1.6.1       
#>  [52] shinyalert_3.0.0   dplyr_1.1.0        magrittr_2.0.3    
#>  [55] Matrix_1.5-3       Rcpp_1.0.10        fansi_1.0.3       
#>  [58] ape_5.7-1          lifecycle_1.0.3    yaml_2.3.5        
#>  [61] stringi_1.7.6      brio_1.1.3         MASS_7.3-57       
#>  [64] pkgbuild_1.4.0     grid_4.2.1         shinyFiles_0.9.3  
#>  [67] parallel_4.2.1     promises_1.2.0.1   crayon_1.5.1      
#>  [70] miniUI_0.1.1.1     lattice_0.20-45    knitr_1.42        
#>  [73] ps_1.7.2           pillar_1.8.1       igraph_1.4.1      
#>  [76] seqinr_4.2-30      markdown_1.5       pkgload_1.3.2     
#>  [79] glue_1.6.2         evaluate_0.20      data.table_1.14.8 
#>  [82] remotes_2.4.2      vctrs_0.5.2        tzdb_0.3.0        
#>  [85] httpuv_1.6.5       testthat_3.1.6     purrr_1.0.1       
#>  [88] tidyr_1.3.0        ks_1.14.0          cachem_1.0.6      
#>  [91] xfun_0.37          mime_0.12          xtable_1.8-4      
#>  [94] tidyverse_2.0.0    pracma_2.4.2       roxygen2_7.2.3    
#>  [97] later_1.3.0        tibble_3.1.8       memoise_2.0.1     
#> [100] shinyWidgets_0.7.6 timechange_0.2.0   ellipsis_0.3.2

An Introduction to `shinyWGD`

Jia Li

2023-12-13

1 Getting Started

2 Installation

2.1 Install the latest development version from GitHub

3 Usage

3.1 Data Preparation

3.1.1 Preparing the input for `wgd`, `ksrates`, `i-ADHoRe` and `OrthoFinder`

3.1.1.1 Option One

3.1.1.2 Option Two

3.1.2 Executing `wgd`, `krates`, `i-ADHoRe` or `OrthoFinder` locally

3.2 Whale Preparation

3.3 TimeTreeFetcher

3.4 K_sDist

3.4.1 Paralogous K_s age distribution

3.4.2 Orthologous K_s age distribution

3.4.3 Substitution rate correction

3.4.4 Phylo-K_s analysis

3.5 Collinear Analysis

3.5.1 Intra- or Inter-species alignment

3.5.2 Multiplicon-level synteny

3.5.3 Multiple species alignment

3.5.4 Clustering analysis

3.6 TreeRecon

3.7 VizWGD

3.7.1 Time tree

3.7.2 Ultrametric tree

4 Structure

5 Dependencies

External Software

R packages

6 References

7 Session info

An Introduction to shinyWGD

Jia Li

2023-12-13

1 Getting Started

2 Installation

2.1 Install the latest development version from GitHub

3 Usage

3.1 Data Preparation

3.1.1 Preparing the input for wgd, ksrates, i-ADHoRe and OrthoFinder

3.1.1.1 Option One

3.1.1.2 Option Two

3.1.2 Executing wgd, krates, i-ADHoRe or OrthoFinder locally

3.2 Whale Preparation

3.3 TimeTreeFetcher

3.4 KsDist

3.4.1 Paralogous Ks age distribution

3.4.2 Orthologous Ks age distribution

3.4.3 Substitution rate correction

3.4.4 Phylo-Ks analysis

3.5 Collinear Analysis

3.5.1 Intra- or Inter-species alignment

3.5.2 Multiplicon-level synteny

3.5.3 Multiple species alignment

3.5.4 Clustering analysis

3.6 TreeRecon

3.7 VizWGD

3.7.1 Time tree

3.7.2 Ultrametric tree

4 Structure

5 Dependencies

External Software

R packages

6 References

7 Session info

An Introduction to `shinyWGD`

3.1.1 Preparing the input for `wgd`, `ksrates`, `i-ADHoRe` and `OrthoFinder`

3.1.2 Executing `wgd`, `krates`, `i-ADHoRe` or `OrthoFinder` locally

3.4 K_sDist

3.4.1 Paralogous K_s age distribution

3.4.2 Orthologous K_s age distribution

3.4.4 Phylo-K_s analysis