Genomic Data Processing 48: ART Usage Examples
Published: 2021-03-12 02:45:13  Category: Big Data  Source: compiled from the web
For the relevant parameters, see the previous post.

1. Usage example 1:

hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub/cs-bwamem$ art_illumina -ss HS20 -i GRCH38chr1L3556522.fna -l 100 -f 20 -o G38L100F20Nhs20
    ====================ART====================
             ART_Illumina (2008-2016)          
          Q Version 2.5.1 (Apr 17,2016)       
     Contact: Weichun Huang <whduke@gmail.com> 
    -------------------------------------------
                  Single-end Simulation
Total CPU time used: 1162.71
The random seed for the run: 1464879720
Parameters used during run
    Read Length:    100
    Genome masking 'N' cutoff frequency:    1 in 100
    Fold Coverage:            20X
    Profile Type:             Combined
    ID Tag:                   
Quality Profile(s)
    First Read:   HiSeq 2000 Length 100 R1 (built-in profile) 
Output files
  FASTQ Sequence File:
    G38L100F20Nhs20.fq
  ALN Alignment File:
    G38L100F20Nhs20.aln

2. Usage example 2:

hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub/cs-bwamem$ art_illumina -ss HS25 -sam -i GRCH38chr1L3556522.fna -p -l 150 -f 20 -m 200 -s 10 -o paired_dat
    ====================ART====================
             ART_Illumina (2008-2016)          
          Q Version 2.5.1 (Apr 17,2016)       
     Contact: Weichun Huang <whduke@gmail.com> 
    -------------------------------------------
                  Paired-end sequencing simulation
Total CPU time used: 1070.33
The random seed for the run: 1464880583
Parameters used during run
    Read Length:    150
    Genome masking 'N' cutoff frequency:    1 in 150
    Fold Coverage:            20X
    Mean Fragment Length:     200
    Standard Deviation:       10
    Profile Type:             Combined
    ID Tag:                   
Quality Profile(s)
    First Read:   HiSeq 2500 Length 150 R1 (built-in profile) 
    First Read:   HiSeq 2500 Length 150 R2 (built-in profile) 
Output files
  FASTQ Sequence Files:
     the 1st reads: paired_dat1.fq
     the 2nd reads: paired_dat2.fq
  ALN Alignment Files:
     the 1st reads: paired_dat1.aln
     the 2nd reads: paired_dat2.aln
  SAM Alignment File:
    paired_dat.sam

List the files:

hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub/cs-bwamem$ ll -h
total 50G
drwxrwxr-x 2 hadoop hadoop 4.0K 6月 2 23:16 ./
drwxrwxr-x 6 hadoop hadoop 4.0K 6月 2 22:59 ../
-rw-rw-r-- 1 hadoop hadoop 11G 6月 2 23:29 G38L100F20Nhs20.aln
-rw-rw-r-- 1 hadoop hadoop 9.4G 6月 2 23:29 G38L100F20Nhs20.fq
-rw-r--r-- 1 hadoop hadoop 241M 6月 2 23:00 GRCH38chr1L3556522.fna
-rw-rw-r-- 1 hadoop hadoop 2.5K 6月 2 23:09 GRCH38chr1L3556522.fna.amb
-rw-rw-r-- 1 hadoop hadoop 144 6月 2 23:09 GRCH38chr1L3556522.fna.ann
-rw-rw-r-- 1 hadoop hadoop 238M 6月 2 23:09 GRCH38chr1L3556522.fna.bwt
-rw-rw-r-- 1 hadoop hadoop 60M 6月 2 23:09 GRCH38chr1L3556522.fna.pac
-rw-rw-r-- 1 hadoop hadoop 119M 6月 2 23:10 GRCH38chr1L3556522.fna.sa
-rw-rw-r-- 1 hadoop hadoop 4.9G 6月 2 23:42 paired_dat1.aln
-rw-rw-r-- 1 hadoop hadoop 4.6G 6月 2 23:42 paired_dat1.fq
-rw-rw-r-- 1 hadoop hadoop 4.8G 6月 2 23:42 paired_dat2.aln
-rw-rw-r-- 1 hadoop hadoop 4.6G 6月 2 23:42 paired_dat2.fq
-rw-rw-r-- 1 hadoop hadoop 11G 6月 2 23:42 paired_dat.sam

The generated files are all very large.

3. Specify the number of reads to generate per sequence (the output is much smaller):

hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub/cs-bwamem$ art_illumina -ss HS20 -i GRCH38chr1L3556522.fna -l 100 -c 50 -o G38L100c50Nhs20
    ====================ART====================
             ART_Illumina (2008-2016)          
          Q Version 2.5.1 (Apr 17,2016)       
     Contact: Weichun Huang <whduke@gmail.com> 
    -------------------------------------------
                  Single-end Simulation
Total CPU time used: 15.96
The random seed for the run: 1464918709
Parameters used during run
    Read Length:    100
    Genome masking 'N' cutoff frequency:    1 in 100
    Fold Coverage:            0X
    Profile Type:             Combined
    ID Tag:                   
Quality Profile(s)
    First Read:   HiSeq 2000 Length 100 R1 (built-in profile) 
Output files
  FASTQ Sequence File:
    G38L100c50Nhs20.fq
  ALN Alignment File:
    G38L100c50Nhs20.aln
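
The gap between the runs above (about 1162 CPU seconds with -f 20 versus about 16 with -c 50) follows from how many reads each option asks for. A rough sketch of that arithmetic in Python is given here. It assumes the usual approximation that the read count is fold coverage times reference length divided by read length, and it uses a hypothetical reference length, since the article does not state the exact length of GRCH38chr1L3556522.fna. The function name and the constant are illustrative only and are not part of ART.

```python
def reads_for_coverage(genome_length, read_length, fold, paired=False):
    """Approximate number of reads (or read pairs when paired=True)
    needed to reach the requested fold coverage."""
    if read_length <= 0:
        raise ValueError("read_length must be positive")
    bases_needed = fold * genome_length
    # A read pair contributes two reads' worth of bases.
    bases_per_unit = read_length * (2 if paired else 1)
    return round(bases_needed / bases_per_unit)


# ASSUMPTION: hypothetical reference length, not taken from the article.
ASSUMED_GENOME_LENGTH = 230_000_000

# Example 1 settings: single-end, -l 100 -f 20
single = reads_for_coverage(ASSUMED_GENOME_LENGTH, 100, 20)
# Example 2 settings: paired-end, -l 150 -f 20
pairs = reads_for_coverage(ASSUMED_GENOME_LENGTH, 150, 20, paired=True)

print("single-end reads:", single)
print("read pairs:", pairs)
```

Under this assumption the coverage-based runs need tens of millions of reads, while -c 50 requests only 50 reads per input sequence, which is consistent with the log reporting Fold Coverage as 0X for that run.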
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub/cs-bwamem$ ls
G38L100c50Nhs20.aln  G38L100F20Nhs20.aln  GRCH38chr1L3556522.fna      GRCH38chr1L3556522.fna.ann  GRCH38chr1L3556522.fna.pac  paired_dat1.aln  paired_dat2.aln  paired_dat.sam
G38L100c50Nhs20.fq   G38L100F20Nhs20.fq   GRCH38chr1L3556522.fna.amb  GRCH38chr1L3556522.fna.bwt  GRCH38chr1L3556522.fna.sa   paired_dat1.fq   paired_dat2.fq
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub/cs-bwamem$ ll
total 51506772
drwxrwxr-x 2 hadoop hadoop        4096  6月  3 09:51 ./
drwxrwxr-x 6 hadoop hadoop        4096  6月  2 22:59 ../
-rw-rw-r-- 1 hadoop hadoop       11400  6月  3 09:52 G38L100c50Nhs20.aln
-rw-rw-r-- 1 hadoop hadoop       10428  6月  3 09:52 G38L100c50Nhs20.fq
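
File size only hints at how many reads a run produced. A more direct check is to count the records in the FASTQ output. The sketch below assumes the file uses exactly four lines per record (header, sequence, separator, qualities). The helper name count_fastq_reads is hypothetical, and the demo runs on a small made-up file rather than on the real output.

```python
import os
import tempfile


def count_fastq_reads(path):
    """Count records in a FASTQ file, assuming four lines per record."""
    with open(path, "r") as handle:
        line_count = sum(1 for _ in handle)
    if line_count % 4 != 0:
        raise ValueError(f"{path}: {line_count} lines is not a multiple of 4")
    return line_count // 4


# Self-contained demo on a tiny made-up FASTQ file with two fake reads.
demo = "@read-1\nACGT\n+\nIIII\n@read-2\nTTGA\n+\nIIII\n"
with tempfile.NamedTemporaryFile("w", suffix=".fq", delete=False) as tmp:
    tmp.write(demo)
    demo_path = tmp.name

demo_count = count_fastq_reads(demo_path)
os.remove(demo_path)
print("records in demo file:", demo_count)
```

On the real data the same function could be pointed at G38L100c50Nhs20.fq to see how many reads the -c 50 run actually wrote.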

