Raw data file format

Outline: Introduction | Raw data processing | Expression matrix processing and visualization | Cell type annotation | Case study

Files in pbmc_1k_v3_fastqs:
    pbmc_1k_v3_S1_L001_I1_001.fastq.gz
    pbmc_1k_v3_S1_L001_R1_001.fastq.gz
    pbmc_1k_v3_S1_L001_R2_001.fastq.gz
    pbmc_1k_v3_S1_L002_I1_001.fastq.gz
    pbmc_1k_v3_S1_L002_R1_001.fastq.gz
    pbmc_1k_v3_S1_L002_R2_001.fastq.gz

I1 (sample index read, 8 bp):
    @A00228:279:HFWFVDMXX:1:1101:8486:1000 1:N:0:NCATTACT
    NCATTACT
    +
    #FFFFFFF
    @A00228:279:HFWFVDMXX:1:1101:10782:1000 1:N:0:NCATTACT
    NCATTACT
    +
    #FFFFFFF

R1 (28 bp):
    @A00228:279:HFWFVDMXX:1:1101:8486:1000 1:N:0:NCATTACT
    NGTGATTAGCTGTACTCGTATGTAAGGT
    +
    #FFFFFFFFFFFFFFFFFFFFFFFFFFF
    @A00228:279:HFWFVDMXX:1:1101:10782:1000 1:N:0:NCATTACT
    NTCATGAAGTTTGGCTAGTTATGTTCAT
    +
    #FFFFFFFFFFFFFFFFFFFFFFFFFFF

R2 (91 bp):
    @A00228:279:HFWFVDMXX:1:1101:8486:1000 2:N:0:NCATTACT
    NACAAAGTCCCCCCCATAATACAGGGGGAGCCACTTGGGCAGGAGGCAGGGAGGGGTCCATTCCCCCTGGTGGGGCTGGTGGGGAGCTGTA
    +
    #FFFFFFFFFFFFFFF:FFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFF
    @A00228:279:HFWFVDMXX:1:1101:10782:1000 2:N:0:NCATTACT
    NTTGCAGCTGAACTGGTAAACTTGTCCCTAAAGAGACATAAGAATGGTCAACTGGAATGTGGATTCATCTGTAACATTACTCAGTGGGCCT
    +
    #FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
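Each FASTQ record spans four lines: a header starting with "@", the base sequence, a "+" separator, and a quality string with one character per base. A minimal Python sketch of reading such records follows; the names FastqRecord and read_fastq are hypothetical helpers written for this illustration, not part of any tool named on the slides.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    header: str    # header line without the leading "@"
    sequence: str  # base calls
    quality: str   # one quality character per base


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from an uncompressed FASTQ text stream, four lines at a time."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of file (or a blank line): stop reading
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline().rstrip("\n")
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed FASTQ record near: {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in: {header!r}")
        yield FastqRecord(header[1:], sequence, quality)


# The first R1 record from the slide; the quality string is "#" followed by 27 "F".
example = io.StringIO(
    "@A00228:279:HFWFVDMXX:1:1101:8486:1000 1:N:0:NCATTACT\n"
    "NGTGATTAGCTGTACTCGTATGTAAGGT\n"
    "+\n"
    + "#" + "F" * 27 + "\n"
)
records = list(read_fastq(example))
print(records[0].header, len(records[0].sequence))
```

For the real .fastq.gz files, the same function should work on a handle opened with gzip.open(path, "rt").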
Raw data processing: quality control (FastQC)

• Run command
    fastqc -t 8 -o path/fastqc sample1_R1.fq sample1_R2.fq
• Parameters
    -o, --outdir        output directory
    --extract           uncompress the result files
    --noextract         keep the result files compressed
    -f, --format        input file format
    -t, --threads       number of threads
    -c, --contaminants  specify the contaminant sequences
    -a, --adapters      specify the adapter sequences
    -k, --kmers         specify the k-mer length (2-10 bp, default 7 bp)
    -q, --quiet         quiet mode
• Output (html and zip)
    The FastQC Report summary lists these modules: Basic Statistics, Per base sequence quality, Per tile sequence quality, Per sequence quality scores, Per base sequence content, Per sequence GC content, Per base N content, Sequence Length Distribution, Sequence Duplication Levels, Overrepresented sequences, Adapter Content, Kmer Content.
Raw data processing: interpreting FastQC results

[Figure: FastQC report panels. Basic Statistics table (Measure / Value: Filename, File type, Encoding "Sanger / Illumina 1.9", Total Sequences, Sequences flagged as poor quality, Sequence length); Per base sequence quality plot ("Quality scores across all bases (Sanger / Illumina 1.9 encoding)", x-axis: Position in read (bp)); Adapter Content plot.]
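The "Sanger / Illumina 1.9" encoding reported by FastQC is Phred+33: each quality character encodes a score Q = ASCII code - 33, and Q corresponds to an estimated base-call error probability of 10^(-Q/10). A short Python sketch of that conversion follows; the function names are hypothetical, chosen only for this illustration.

```python
from typing import List

PHRED_OFFSET = 33  # Sanger / Illumina 1.8+ quality encoding


def phred_scores(quality: str) -> List[int]:
    """Convert a FASTQ quality string to a list of Phred scores."""
    return [ord(ch) - PHRED_OFFSET for ch in quality]


def error_probability(q: int) -> float:
    """Estimated probability that a base with Phred score q is wrong."""
    return 10 ** (-q / 10)


# Quality characters that appear in the example reads: "#", "F" and ":".
for ch in "#F:":
    q = phred_scores(ch)[0]
    print(ch, q, error_probability(q))
```

In the example reads, "#" (Q2) marks the leading N base, while "F" (Q37) and ":" (Q25) are high-confidence calls.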
Raw data processing: barcodes (BC) and UMIs on different sequencing platforms

10X V2 (Chromium Single Cell 3' Gene Expression library)
    Final library structure: P5, Read 1, 10x barcode, UMI, Poly(dT)VN, insert, Read 2, sample index, P7
    Sample index (i7): 8 bp
    Read 1: 26 bp (10x BC + UMI)
    Read 2: 98 bp (insert)

10X V3
    Final library structure: P5, TruSeq Read 1, 10x barcode, UMI, Poly(dT)VN, insert, TruSeq Read 2, sample index, P7
    Sample index (i7): 8 bp
    Read 1: 28 bp (10x BC + UMI)
    Read 2: 91 bp (insert)

BD Rhapsody (Figure 4. Structure of R1 read)
    Segment    Length (bp)    Position
    CLS1       9              1-9
    L1         12             10-21
    CLS2       9              22-30
    L2         13             31-43
    CLS3       9              44-52
    UMI        8              53-60
    poly(T)    18             61-78
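To make the layouts concrete, the Python sketch below cuts an R1 read into named segments. The BD Rhapsody segment lengths come from the figure above. The split of the 10x Read 1 into a 16 bp barcode followed by a 10 bp (V2) or 12 bp (V3) UMI is not stated on the slide; it is an assumption based on the standard 10x library design and should be checked against the kit documentation. The helper name split_read is hypothetical.

```python
from typing import Dict, List, Tuple

Layout = List[Tuple[str, int]]  # consecutive (segment name, length in bp)

# Assumption: 16 bp barcode + 10/12 bp UMI (the slide gives only the 26/28 bp totals).
LAYOUTS: Dict[str, Layout] = {
    "10x_v2": [("cell_barcode", 16), ("umi", 10)],
    "10x_v3": [("cell_barcode", 16), ("umi", 12)],
    # Segment lengths as listed for the BD Rhapsody R1 read.
    "bd_rhapsody": [("CLS1", 9), ("L1", 12), ("CLS2", 9), ("L2", 13),
                    ("CLS3", 9), ("UMI", 8), ("polyT", 18)],
}


def split_read(sequence: str, layout: Layout) -> Dict[str, str]:
    """Cut a read into the consecutive segments described by `layout`."""
    total = sum(length for _, length in layout)
    if len(sequence) < total:
        raise ValueError(f"read has {len(sequence)} bp, layout needs {total} bp")
    segments: Dict[str, str] = {}
    start = 0
    for name, length in layout:
        segments[name] = sequence[start:start + length]
        start += length
    return segments


# First R1 read of the pbmc_1k_v3 example (10x V3 chemistry).
parts = split_read("NGTGATTAGCTGTACTCGTATGTAAGGT", LAYOUTS["10x_v3"])
print(parts["cell_barcode"], parts["umi"])
```

For BD Rhapsody, the cell label would be the concatenation of the CLS1, CLS2 and CLS3 segments, with L1 and L2 acting as fixed linkers.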
Raw data processing: cell demultiplexing (UMI-tools)

UMI-tools: tools for dealing with Unique Molecular Identifiers

Step 1: Get data
Step 2: Identify correct cell barcodes
Step 3: Extract barcodes and UMIs and add to read names
Step 4: Map reads
Step 5: Assign reads to genes
Step 6: Count UMIs per gene per cell

R1
    @A00228:279:HFWFVDMXX:1:1101:8486:1000 1:N:0:NCATTACT
    NGTGATTAGCTGTACTCGTATGTAAGGT
    +
    #FFFFFFFFFFFFFFFFFFFFFFFFFFF
    @A00228:279:HFWFVDMXX:1:1101:10782:1000 1:N:0:NCATTACT
    NTCATGAAGTTTGGCTAGTTATGTTCAT
    +
    #FFFFFFFFFFFFFFFFFFFFFFFFFFF

R2
    @A00228:279:HFWFVDMXX:1:1101:8486:1000 2:N:0:NCATTACT
    NACAAAGTCCCCCCCATAATACAGGGGGAGCCACTTGGGCAGGAGGCAGGGAGGGGTCCATTCCCCCTGGTGGGGCTGGTGGGGAGCTGTA
    +
    #FFFFFFFFFFFFFFF:FFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFF
    @A00228:279:HFWFVDMXX:1:1101:10782:1000 2:N:0:NCATTACT
    NTTGCAGCTGAACTGGTAAACTTGTCCCTAAAGAGACATAAGAATGGTCAACTGGAATGTGGATTCATCTGTAACATTACTCAGTGGGCCT
    +
    #FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
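The idea behind Steps 3 and 6 can be sketched in a few lines of Python. This is a simplified illustration, not UMI-tools' implementation: it assumes a 16 bp barcode followed by a 12 bp UMI in R1, counts exact-match UMIs only (UMI-tools additionally corrects for sequencing errors in UMIs), and uses made-up gene names. Both function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def tag_read_name(header: str, r1_sequence: str,
                  barcode_len: int = 16, umi_len: int = 12) -> str:
    """Step 3 idea: move barcode and UMI from the R1 sequence into the read name."""
    barcode = r1_sequence[:barcode_len]
    umi = r1_sequence[barcode_len:barcode_len + umi_len]
    read_id, _, comment = header.partition(" ")
    tagged = f"{read_id}_{barcode}_{umi}"
    return f"{tagged} {comment}" if comment else tagged


def count_umis(assigned: Iterable[Tuple[str, str, str]]) -> Dict[Tuple[str, str], int]:
    """Step 6 idea: count distinct UMIs for every (cell barcode, gene) pair.

    `assigned` yields (cell_barcode, umi, gene) for each read that was mapped
    and assigned to a gene in Steps 4 and 5.
    """
    seen: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for cell, umi, gene in assigned:
        seen[(cell, gene)].add(umi)
    return {key: len(umis) for key, umis in seen.items()}


name = tag_read_name("A00228:279:HFWFVDMXX:1:1101:8486:1000 1:N:0:NCATTACT",
                     "NGTGATTAGCTGTACTCGTATGTAAGGT")
print(name)

# Hypothetical assignments: two reads share a UMI (PCR duplicates), so they count once.
counts = count_umis([
    ("CELL_A", "AAAA", "GeneX"),
    ("CELL_A", "AAAA", "GeneX"),
    ("CELL_A", "CCCC", "GeneX"),
    ("CELL_B", "AAAA", "GeneY"),
])
print(counts)
```

Counting distinct UMIs rather than reads is what removes PCR duplicates: reads with the same cell barcode, gene and UMI are treated as copies of one original molecule.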