Repertoire Analysis FAQ
- What is an immune repertoire?
- The major immune cells, T and B cells, have different individual specificities, determined by the T- or B-cell receptors expressed on their surfaces, that allow them to respond to a wide variety of antigens. A collection of lymphocytes characterized by TCR and BCR is called a TCR/BCR repertoire. "Repertoire" is a French word, synonymous with "Repertory."
- How does a repertoire arise?
- To react to various antigens, the immune system creates diverse TCRs and BCRs by gene rearrangement and somatic hypermutation, and a repertoire is formed through these mechanisms. Gene rearrangement mainly occurs in lymphoid tissue (bone marrow for B cells and the thymus for T cells).
- What is the scale of a repertoire?
- The diversity is estimated at about 10^18 for the TCR and 10^14 for the BCR.
- What is a T-cell receptor?
- A T-cell receptor (TCR) is a heterodimer antigen receptor molecule expressed on the T-cell membrane. TCR recognizes antigen molecules that have bound to major histocompatibility complex (MHC) molecules. TCR is a major element in cellular immunity.
- What is a B-cell receptor?
- A B-cell receptor (BCR) is an antigen receptor molecule present on the B-cell membrane; its secreted form is the antibody, which causes a neutralization reaction after binding to an antigen. BCR consists of a heavy chain and a light chain, and its function is determined by the class of the heavy-chain constant region, which can change through class switching. BCR is the major element in humoral immunity.
- What are the types of TCR?
- There are four types of TCR chain: the α chain, β chain, γ chain, and δ chain. The α and β chains, and the γ and δ chains, are expressed on cells in the form of heterodimers.
- What are the types of BCR?
- The BCR heavy chains are the μ chain (IgM), γ chain (IgG), α chain (IgA), δ chain (IgD), and ε chain (IgE), and the light chains are the κ chain (IgK) and λ chain (IgL). BCR is expressed on the cell membrane, or secreted, as a complex in which each heavy chain pairs with a light chain. To prevent confusion with TCR chains, BCR chains are written using nomenclature such as IgG.
- What are V, J, and C genes?
- The TCR and BCR loci in the genome contain multiple gene segments: V (variable), D (diversity), J (joining), and C (constant). During cell differentiation and maturation, these segments are rearranged, which creates a large amount of diversity.
- What is CDR3?
- CDR3 stands for complementarity-determining region 3. In the TCR it is the region that directly contacts the antigen peptide, and in the BCR it is the region that directly binds the antigen. The sequence encoding CDR3 is located between the V gene and the J gene. The region is known to show great diversity because bases are randomly inserted or deleted during gene rearrangement.
- What is repertoire analysis?
- Repertoire analysis clarifies the specificity and diversity of TCR or BCR by examining the usage frequency of the V and J genes and the nucleotide sequence of the CDR3 of the TCR or BCR. To determine the nucleotide sequence of the TCR or BCR gene, we use mRNA as the template and an unbiased gene amplification method originally developed by our company. This enables exhaustive and accurate repertoire analysis.
- What targets are analyzed?
- Our repertoire analysis targets the α, β, γ, and δ chains for the T-cell receptor (TCR), and, for the B-cell receptor (BCR), the heavy chains μ (IgM), γ (IgG), α (IgA), δ (IgD), and ε (IgE) and the light chains λ (IgL) and κ (IgK).
- Please tell us about Adaptor-Ligation PCR (AL-PCR).
- AL-PCR adds an adapter sequence at the 5′ end of the DNA, which is then subjected to PCR amplification using a primer pair consisting of an adapter primer and a constant region (C region) specific primer. All TCR or BCR genes can thereby be amplified without bias. Please check the technology page for more information.
- Are there any advantages in comparison with other technologies?
- In principle, AL-PCR, the method adopted by our company, does not introduce PCR bias during gene amplification. Its results therefore correlate well with repertoire data obtained using techniques such as flow cytometry (FACS).
In contrast, the commonly used Multiplex PCR method relies on several primers specific for the V and J genes, and PCR bias arises from differences in amplification efficiency between the primers. In addition, for BCR, somatic hypermutation occurs frequently in the V and J genes, so mismatches between the designed primers and the templates are inevitable and some genes fail to amplify. Because AL-PCR does not require a primer designed against a region where mutations can occur, it yields a more accurate repertoire.
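To illustrate why differences in primer efficiency matter, here is a minimal sketch of idealized exponential amplification. The efficiencies (0.95, 0.85, 0.90), the starting copy numbers, and the 30 cycles are illustrative assumptions, not measured values for any primer set or for AL-PCR itself.

```python
def amplified_copies(start_copies: float, efficiency: float, cycles: int) -> float:
    """Idealized PCR: each cycle multiplies the copies by (1 + efficiency)."""
    return start_copies * (1.0 + efficiency) ** cycles


def observed_fractions(templates: dict[str, tuple[float, float]], cycles: int) -> dict[str, float]:
    """Map template name -> fraction of the final product.

    templates maps name -> (starting copies, per-cycle efficiency between 0 and 1).
    """
    final = {name: amplified_copies(n0, eff, cycles) for name, (n0, eff) in templates.items()}
    total = sum(final.values())
    return {name: copies / total for name, copies in final.items()}


# Two V genes at equal abundance, amplified by V-specific primers of unequal efficiency
multiplex = observed_fractions({"V_gene_A": (1000, 0.95), "V_gene_B": (1000, 0.85)}, cycles=30)

# One shared primer pair (adapter + C region): the same efficiency for every template
single_pair = observed_fractions({"V_gene_A": (1000, 0.90), "V_gene_B": (1000, 0.90)}, cycles=30)

print(multiplex)
print(single_pair)
```

Under these assumptions, the template with the more efficient primer makes up more than 80% of the product even though both started at 50%, while the shared primer pair preserves the 50:50 ratio.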
- What is the difference between FACS and your analyses?
- Repertoire analysis reagents commercially available for FACS today include a panel of antibodies against the human TCR Vβ chain. This antibody panel covers ~70% of the current Vβ chains, but very few antibodies are available against the Vα chain. Moreover, because FACS requires numerous cells, only a few types of samples can be analyzed, and FACS is not suitable for analyzing cells infiltrating tissues. Our repertoire analysis uses total RNA extracted from the sample, so a wide variety of samples can be analyzed.
Large-scale next-generation sequencing is followed by a homology search against a dedicated database, translation of the CDR3 into an amino acid sequence, and aggregation of read counts using dedicated software. This makes possible a frequency analysis of all V and J genes and a ranking analysis of all clones by CDR3 amino acid sequence. We have also confirmed that the results of our repertoire analysis correlate well with those of FACS analysis, so customers who already have repertoire data from FACS can use our service with confidence.
- Please tell us about the sequencer you are using.
- We use a MiSeq manufactured by Illumina.
- How do you analyze the sequence data?
- After sequencing is complete, the data are automatically analyzed using repertoire analysis software (Repertoire Genesis) originally developed by our company. Each sequence read is subjected to a homology search against a database of V and J genes, and the amino acid sequence of the CDR3 is determined. Reads sharing the same combination of V gene, J gene, and CDR3 sequence are then aggregated to determine the abundance of each clone in the specimen. We can also generate a 3D graph that cross-tabulates V and J gene usage.
CDR3 length analysis, which is useful for assessing clonality, can be generated digitally from the CDR3 sequences obtained by next-generation repertoire analysis, without conducting a separate CDR3 spectratyping experiment.
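The aggregation and CDR3 length steps can be sketched in a few lines. This is only an illustration of the idea, not the Repertoire Genesis implementation: the input is assumed to be reads already annotated with a V gene, a J gene, and a CDR3 amino acid sequence, and the CDR3 sequences and function names below are made up for the example.

```python
from collections import Counter

# Hypothetical annotated reads: (V gene, J gene, CDR3 amino acid sequence)
reads = [
    ("TRBV20-1", "TRBJ2-1", "CSARDRGNEQFF"),
    ("TRBV20-1", "TRBJ2-1", "CSARDRGNEQFF"),
    ("TRBV20-1", "TRBJ2-1", "CSARGGSYNEQFF"),
    ("TRBV7-9", "TRBJ2-7", "CASSLGQGYEQYF"),
    ("TRBV20-1", "TRBJ2-1", "CSARDRGNEQFF"),
]


def rank_clones(annotated_reads):
    """Count reads per (V, J, CDR3) combination and rank by read count.

    Returns a list of (clone, read count, percent of all reads).
    """
    counts = Counter(annotated_reads)
    total = sum(counts.values())
    return [(clone, n, 100.0 * n / total) for clone, n in counts.most_common()]


def cdr3_length_distribution(annotated_reads):
    """Count reads per CDR3 length in amino acids (a digital spectratype profile)."""
    return Counter(len(cdr3) for _, _, cdr3 in annotated_reads)


ranking = rank_clones(reads)
lengths = cdr3_length_distribution(reads)

for clone, n, pct in ranking:
    print(clone, n, f"{pct:.1f}%")
print(sorted(lengths.items()))
```

A single dominant entry at the top of the ranking, or a length distribution concentrated at one length, both point toward a clonal sample.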
- We are not familiar with the TCR V chain gene names.
- There are multiple nomenclatures for the TCR V chain, and each numbers the genes in a different order. Please check the comparison table by IMGT. Although notation such as “Vβ” was previously used for FACS analysis, the IMGT classification, a numbering system based on the location of the gene on the genome (written as TRAV or TRBV), is now commonly used. We also use the IMGT classification for notation.
- Is the analysis software original?
- We have developed a dedicated, fully automatic repertoire analysis program called Repertoire Genesis to process the data output from next-generation sequencing at high speed and with high accuracy.
Repertoire Analysis Requests
- What type of samples can be analyzed?
- Samples that contain T cells for TCR repertoire analysis and B cells for BCR repertoire analysis should be used. In addition, total RNA in both samples should not be degraded. RNAs are easily degraded; therefore cells used for assays should be kept alive or should be immersed or lysed in RNA stabilizing solution. Please confirm this in the sample preparation manual provided by Repertoire Genesis Inc., which explains in detail how to handle blood, tissue, cultured cell, sorted cell samples, etc.
- How much of the specimen is needed?
- Repertoire analysis assumes that the specimen contains a sufficient number of lymphocytes (the targets of analysis). For a healthy individual, 5 mL of whole blood (i.e., 5 × 10^6 peripheral blood mononuclear cells) or ~100 mg of tissue is generally required, but for sites other than lymphoid tissues (e.g., skin tissue), a larger amount of specimen is required.
If you sort specific cell populations such as CD4+ or CD8+ cells, analysis is possible with 1 × 10^4 to 10^6 cells. If you consult us about the analysis or the amount of specimen required, we will advise you individually on an appropriate approach.
- Can you conduct repertoire analysis with RNA extracted by a customer?
- This is possible. However, before we begin the repertoire analysis, we conduct a quality check using an Agilent TapeStation at our company. If the RINe value is extremely low (1–5), favorable results may not be obtained, so we contact the customer before analysis. If the customer cancels the analysis, only the cost of the quality check is charged.
- Can you conduct a repertoire analysis using genomic DNA?
- Our repertoire analysis is only suitable for total RNA. Unfortunately, we cannot analyze genomic DNA.
- How should tissue for analysis be stored?
- We recommend that you store the tissue immersed in RNA stabilization reagent (RNAlater). For more information, please check our protocol of tissue collection.
- Can the submitted specimen be returned after analysis?
- In principle, we cannot return the specimen. However, we can return the remainder of the total RNA extracted from the sample.
- How do I send a specimen to your company?
- Please send it as per our sample shipping protocol.
- What is the delivery time required?
- We generally deliver the results in 1–2 months. We estimate the delivery time after we receive your specimen, and let you know the analysis schedule; however, the schedule may be changed depending on the schedule of routine analysis with sequencing equipment. If you require the results earlier (e.g., for a conference or because of the paper submission deadline), please contact us.
- Is it possible to reserve an entire sequencing run for our samples?
- Because next-generation sequencing yields numerous reads, tag sequences (index sequences) are added so that multiple samples can be sequenced together. Therefore, if the number of specimens is less than that required for a single run, your samples are sequenced together with other samples. If you want a run reserved exclusively for your samples, please consult us. We will provide an individual estimate.
Evaluation of the Analysis Results
- How many reads are required?
- The required number of reads depends on the purpose of the analysis. For a highly diverse sample, to show that the content of a particular TCR is less than 0.001%, more than 100,000 effective reads are required, and the total number of reads actually required is 1.5–2.0 times the number of effective reads. On the other hand, to determine the TCR content of an established T-cell clone, ~1,000 reads are considered sufficient. The read depth can be adjusted if you tell us your target when ordering.
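One way to read the 100,000 figure is that, at a frequency of 0.001%, a clone contributes on average one read per 100,000 effective reads. The sketch below works through that arithmetic together with the 1.5–2.0× factor quoted above; the function names are hypothetical and the calculation is a rough planning aid, not a statistical guarantee of detection.

```python
def effective_reads_needed(min_frequency_percent: float) -> int:
    """Effective reads at which one read corresponds to the given frequency (in percent)."""
    return round(100.0 / min_frequency_percent)


def total_reads_range(effective_reads: int, low: float = 1.5, high: float = 2.0) -> tuple[int, int]:
    """Total reads to plan for, applying the 1.5-2.0x factor to the effective reads."""
    return round(effective_reads * low), round(effective_reads * high)


effective = effective_reads_needed(0.001)        # 0.001% is 1 read in 100,000
total_low, total_high = total_reads_range(effective)
print(effective, total_low, total_high)
```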
- How does one read the analysis results?
- The 2D graph shows how frequently each V and J gene is used in the TCR or BCR of the specimen. By comparing gene usage frequencies with those of control samples, outliers can be found. The 3D graph of the VJ gene shows the usage frequency of each V-J gene combination in the TCR or BCR, giving a bird’s eye view of the entire repertoire. An increase in specific VJ combinations can also reveal a change in clonality. The aggregate ranking is a table of the unique TCR or BCR reads present in the specimen, ordered by copy number. A specimen with high diversity consists only of reads with low copy numbers, whereas in a highly clonal specimen a small number of reads with high copy numbers are detected. Using the CDR3 sequence of the reads of interest, various comparisons can be made according to the analysis target and purpose (e.g., whether a particular clone is present in other samples, or whether it has increased or decreased). Please consult us if you have any further questions.
- Why is the peak higher in the 3D graph than in the top-ranking?
- The 3D graph of the VJ gene shows the frequency of genes with the same combination of the VJ gene. Even the same combination of the VJ gene contains genes with different CDR3 sequences; therefore, the high peak in the 3D graph does not necessarily indicate the high presence of a clone. If you want to evaluate clonality, please also confirm the results of the final ranking.
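A small made-up example shows how a V-J combination can be the highest peak without containing the top clone. The reads and CDR3 sequences here are hypothetical and serve only to illustrate the difference between counting by V-J combination and counting by clone.

```python
from collections import Counter

# Hypothetical reads: (V gene, J gene, CDR3 amino acid sequence)
reads = (
    # Ten different clones sharing one V-J combination, one read each
    [("TRBV5-1", "TRBJ1-1", f"CASS{aa}TEAFF") for aa in "ADEGHKLNQR"]
    # One clone with six reads in a different V-J combination
    + [("TRBV7-9", "TRBJ2-7", "CASSLGQGYEQYF")] * 6
)

vj_counts = Counter((v, j) for v, j, _ in reads)   # what the 3D graph counts
clone_counts = Counter(reads)                      # what the ranking counts

top_vj, top_vj_reads = vj_counts.most_common(1)[0]
top_clone, top_clone_reads = clone_counts.most_common(1)[0]

print("highest V-J peak:", top_vj, top_vj_reads)
print("top-ranked clone:", top_clone, top_clone_reads)
```

Here the highest V-J peak is built from ten single-read clones, while the top-ranked clone sits in a lower peak.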
- What types of data will be delivered to us?
- Delivered data include the original sequence data (Fastq), an Excel report file for each sample as output data using the repertoire analysis software, and a complete report file (PDF) where all of the data are gathered into one file. Clone files enable comparison analysis to be easily conducted between sample clones, and a stat file displaying diversity index between samples is also included. Data may be delivered either by DVD media, USB memory, or HDD depending on the number of samples. A printed report of the delivered material combining all data into one file will also be included.
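As a rough idea of what a diversity index expresses, the sketch below computes two widely used indices, the Shannon index and the Gini-Simpson index, from per-clone read counts. Which indices the stat file actually reports is not stated here, so the choice of these two is an assumption, and the read counts are invented.

```python
import math


def shannon_index(clone_counts):
    """Shannon index H = -sum(p * ln p) over clone frequencies p."""
    total = sum(clone_counts)
    return -sum((n / total) * math.log(n / total) for n in clone_counts if n > 0)


def simpson_index(clone_counts):
    """Gini-Simpson index 1 - sum(p^2): chance that two randomly drawn reads belong to different clones."""
    total = sum(clone_counts)
    return 1.0 - sum((n / total) ** 2 for n in clone_counts)


diverse = [25, 25, 25, 25]   # four equally abundant clones
clonal = [97, 1, 1, 1]       # one dominant clone

for name, counts in (("diverse", diverse), ("clonal", clonal)):
    print(name, round(shannon_index(counts), 3), round(simpson_index(counts), 3))
```

Both indices are higher for the evenly distributed sample and drop when a single clone dominates.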
- What types of analyses are available as options?
- Optional services at the experimental level include expression analysis (qPCR) of immune-related genes using leftover samples from repertoire analysis where RNA has been extracted, gene transfer experiments based on sequence data obtained by repertoire analysis, etc. Please inform us of any requests. Secondary analysis services such as data analysis include a comparison between samples based on primary analysis results, preparation of publications (with or without fee), and full-length sequence prediction analysis of specific clones, for the enrichment of the repertoire analysis.
- How should I perform the comparison between specimens?
- You can compare the usage frequency of the V and J genes between samples using the 2D or 3D graphs. You can efficiently track a clone of interest in the ranking data by searching for its CDR3 sequence among all sequence reads of the sample of interest. Please consult us if you have any questions.
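Tracking a clone by its CDR3 sequence can be sketched as a lookup across samples. The sample names, CDR3 sequences, and read counts below are hypothetical, and the input format (read counts keyed by CDR3 amino acid sequence) is an assumption about how the ranking data might be loaded.

```python
def clone_frequency(cdr3_counts: dict[str, int], cdr3: str) -> float:
    """Percentage of a sample's reads carrying the given CDR3 (0.0 if absent)."""
    total = sum(cdr3_counts.values())
    return 100.0 * cdr3_counts.get(cdr3, 0) / total if total else 0.0


# Hypothetical per-sample read counts keyed by CDR3 amino acid sequence
samples = {
    "before": {"CASSLGQGYEQYF": 2, "CSARDRGNEQFF": 48, "CASSPTGNTEAFF": 50},
    "after": {"CASSLGQGYEQYF": 40, "CSARDRGNEQFF": 30, "CASSPTGNTEAFF": 30},
}

target = "CASSLGQGYEQYF"
tracked = {name: clone_frequency(counts, target) for name, counts in samples.items()}
print(tracked)
```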
- What is a Neoepitope?
- The immune system removes foreign objects that have invaded or developed in the body. To remove a foreign object precisely, T cells detect an antigenic determinant (epitope), a specific region on the foreign object. A normal cell can become a cancer cell when a mutation in its genomic sequence makes its phenotype abnormal. As a result, cancer cells are recognized as foreign objects. An antigenic determinant (epitope) that has newly (neo) arisen in a cancer cell is called a neoepitope.
- What is neoepitope analysis?
- Neoepitope analysis identifies mutations that have arisen in the genomic DNA of cancer cells and evaluates whether those mutations can lead to antigen presentation to immune cells.
- What can be analyzed?
- To analyze mutations in the genomic sequence of cancer cells, Exome-seq is performed. In addition, RNA-seq is performed to determine whether each mutation is actually expressed. Because the genome of each individual contains SNPs, and because determining changes in expression levels requires a control, the same analyses are also performed on normal tissue. Since a mutation site may be antigenic to immune cells at the peptide level, it is necessary to determine the human leukocyte antigen (HLA) type, calculate the binding strength of each mutated peptide by bioinformatics, and narrow down the mutated peptides that are candidates for neoepitopes.
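The step from a mutation site to candidate peptides can be illustrated by listing every peptide window that spans the mutated residue. This sketch handles only a single amino acid substitution, assumes a peptide length of 9 (peptides presented by HLA class I are typically about 8 to 11 residues long), and uses an invented protein sequence; HLA binding prediction itself is not shown.

```python
def mutant_peptides(protein: str, position: int, new_residue: str, length: int = 9) -> list[str]:
    """Return every peptide of the given length that spans the substituted residue.

    position is 0-based. Returns an empty list if the protein is shorter than length.
    """
    mutated = protein[:position] + new_residue + protein[position + 1:]
    first_start = max(0, position - length + 1)
    last_start = min(position, len(mutated) - length)
    return [mutated[s:s + length] for s in range(first_start, last_start + 1)]


protein = "MKTAYIAKQRQISFVKSHFSRQ"   # made-up sequence for illustration
peptides = mutant_peptides(protein, position=10, new_residue="W")   # Q -> W at index 10

for p in peptides:
    print(p)
```

Each window would then be scored against the HLA type of the individual, and the same windows from the unmutated sequence can serve as a comparison.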
- Is this type of analysis superior compared to other technologies?
- Until now, the search for cancer cell antigens has, from the perspective of drug discovery, focused on antigens that are common to everybody. However, many of the molecules identified in this way are known to be expressed at low levels even in normal tissues, so a risk of inducing autoimmune disease remained.
Neoepitope analysis provided by Repertoire Genesis Inc. identifies cancer cell specific variant peptides which are completely absent in normal cells so the potential for developing adverse effects can be eliminated. Moreover, searching common variant peptides in the same disease can be made possible by viewing the HLA axis amongst variant peptides that might first appear to be individually different.
- Is the analysis software original?
- We have developed a dedicated neoepitope analysis program that processes the output data from next-generation sequencing fully automatically, at high speed and with high efficiency.
- What kind of samples can be analyzed?
- Cancer cells (cancer tissues) derived from humans or animals as well as normal cells (such as blood) are collected as samples. To conduct complete neoepitope analysis, both total RNA and gDNA are necessary.
- How much sample material is needed?
- For cancer tissue, approximately 100 mg (about the size of a soybean) is needed; for blood, approximately 5 mL. So that both total RNA and gDNA can be extracted, tissue samples should be kept immersed in RNAlater and blood samples should be collected in a vacuum blood collection tube containing anticoagulant.
- Can DNA-RNA samples extracted at an individual lab be analyzed?
- It is possible; however, Repertoire Genesis Inc. will need to check sample quality using a TapeStation and a NanoDrop before starting neoepitope analysis. The analysis requires more than 3 μg each of total RNA and gDNA. With less than 3 μg, favorable results may not be obtained, so we will contact you before performing the analysis. If the customer decides to cancel the analysis, the cost of the quality check will be billed.
- How should tissue and blood samples be stored?
- Cancer tissue needs to be immersed in RNAlater immediately after collection to stabilize total RNA. There are several methods for collecting the blood used as normal tissue. If experimental equipment is available and samples can be pretreated, it is desirable to isolate PBMCs and dissolve them in TRIzol. Whole blood can also be dissolved in TRIzol; however, lysis may be insufficient depending on the condition of the blood. Although gDNA can be extracted from frozen whole blood, freezing dramatically reduces the quality of total RNA.
Therefore, for total RNA extraction, a PAXgene vacuum collection tube specially designed for RNA extraction should be used. Institutions that cannot perform pretreatment should send us blood samples collected in a vacuum collection tube containing anticoagulant, at room temperature, immediately after collection. We will carry out the extraction. Please contact us before collecting samples.
- Can you return leftover samples to us?
- Samples sent to us will be stored at our facility for 3 months after receiving and then routinely discarded. If you would prefer to retain samples, please contact us.
- How can I ship samples?
- Samples in RNAlater, TRIzol, or PAXgene tubes, and frozen blood (for gDNA extraction), need to be shipped to us frozen. Blood samples collected in a vacuum tube containing anticoagulant need to be shipped to us at room temperature immediately after collection.
- What is the delivery schedule?
- Delivery usually takes approximately 2 months. After we receive the samples and plan the analysis schedule, we will notify you of the approximate delivery date. However, the delivery date may change depending on how busy the analytical instruments are. If you need the results sooner because of time constraints such as a meeting or publication deadline, please consult us.
- Can I order analysis using a sequencer solely for our samples?
- A next generation sequencer can process a large number of reads; therefore, multiple samples in a mixture where index sequence is attached to each sample are sequenced all at once. Therefore, if the sample number is insufficient to fill one run, the sample will routinely be run with other samples.
Evaluation of Sequence Data
- How much sequence data is generated?
- To clearly identify variant peptides, Exome-seq and RNA-seq are performed together, aiming to read over 10 Gbp. Because multiple samples are sequenced at the same time, an exact number of reads cannot be specified.
- How can the analysis results be interpreted?
- In addition to information on the client who submitted the samples and on the samples themselves, the neoepitope analysis report contains the variant peptides identified in each sample, the HLA types that can bind each variant peptide, the FPKM value indicating the expression level of each variant peptide, and a neoepitope ranking list predicted from the IC50 value, which indicates the intermolecular binding strength.
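A ranking of this kind can be pictured as filtering and sorting candidate peptides. The sketch below is an assumption about how such a list might be ordered, not the actual report logic: a lower predicted IC50 means stronger predicted binding, and the cutoffs (500 nM, FPKM of 1.0), the class name, and all peptide values are illustrative.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    peptide: str
    hla: str
    ic50_nm: float   # predicted IC50 in nM; lower means stronger predicted binding
    fpkm: float      # expression level associated with the variant


def rank_candidates(candidates, max_ic50_nm=500.0, min_fpkm=1.0):
    """Keep expressed, predicted binders and sort them by ascending IC50."""
    kept = [c for c in candidates if c.ic50_nm <= max_ic50_nm and c.fpkm >= min_fpkm]
    return sorted(kept, key=lambda c: c.ic50_nm)


# Hypothetical candidates for illustration only
candidates = [
    Candidate("TAYIAKQRW", "HLA-A*24:02", 35.0, 12.4),
    Candidate("KQRWISFVK", "HLA-A*11:01", 820.0, 12.4),   # weak predicted binding
    Candidate("SLLDEGKQW", "HLA-A*02:01", 110.0, 0.2),    # barely expressed
    Candidate("AYIAKQRWI", "HLA-A*24:02", 150.0, 12.4),
]

ranked = rank_candidates(candidates)
for c in ranked:
    print(c.peptide, c.hla, c.ic50_nm, c.fpkm)
```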
- What type of data will be delivered?
- The Fastq data obtained from Exome-seq and RNA-seq, and the ranking data of variant peptides produced by the neoepitope analysis software, will be delivered.
- What types of analysis will be available as options?
- Analyses of Indel, Long Indel, Splicing variant, etc. based on data obtained from Exome-seq and RNA-seq are available. Please contact us for other services.
- How should comparison between samples be performed?
- To evaluate neoepitope analysis results, HLA types, genes containing nucleotides coding for variant peptides, mutation sites, expression levels of the genes, and changes in intermolecular coupling force caused by the mutation are considered. After performing neoepitope analysis for specific disease followed by evaluating the above-mentioned categories, new discoveries may be possible.
Flora Analysis FAQ
- What are bacterial flora?
- There are a wide variety of bacteria in nature. They rarely exist as a single species in a given environment; instead, they exist as a group of many different bacteria. For example, the bacteria in feces, the mouth, soil, and waterways are known to comprise over 1000 different species. This type of bacterial distribution is called bacterial flora.
- What is bacterial flora analysis?
- It is an analytical method to identify species and bacterial distributions in a given environment.
- What is analyzed?
- To analyze the types and distribution of bacteria comprehensively and efficiently, bacterial genomic DNA is extracted from the collected samples and the 16S rRNA gene, which contains sequence conserved across bacterial species, is sequenced. Bacterial flora can be revealed from the slight sequence differences between bacterial species.
- Where is the region to be analyzed in 16S rRNA?
- We analyze the V1-V2 or V3-V4 region.
- Is the technology superior to other technologies?
- There are various methods to analyze bacterial flora: growing live bacteria on multiple selective media and identifying them from their growth; PCR amplification followed by restriction enzyme digestion of the PCR products and electrophoresis of the resulting bands; and culturing a diluted bacterial suspension and sequencing each colony individually with a capillary sequencer.
All of these methods are limited in scale and in their power to identify bacteria.
Currently, the mainstream approach is to extract the gDNA of all bacteria in a given environment collectively, amplify the 16S rRNA sequences, and obtain large amounts of comprehensive sequence data by next-generation sequencing. As a result, knowledge of bioinformatics tools (computational science) has become necessary.
We are equipped with all experimental instruments and data analysis servers necessary for bacterial flora analyses (Flora Genesis as the service name). Therefore, clients need only to provide samples in order to obtain analysis data produced by cutting-edge technologies.
- Is this different from metagenome analysis?
- So-called metagenome analysis is a method to analyze genomes of entire bacteria present in a sample using next generation sequencing. While an extremely large amount of information can be obtained from metagenomic analysis, the cost is high due to increased sequencer use for a single sample. 16S rRNA bacterial flora analysis also belongs partly to metagenomic analysis in a broad sense. However, 16S rRNA bacterial flora analysis sequences only a part of a gene. Therefore, multiple samples can be run in one next generation sequence analysis. Thus, the cost becomes low. It is necessary to use different technologies depending on what the research requires.
- What type of sequencer is used?
- MiSeq by Illumina, Inc.
- Is the analysis software original?
- We have been developing a dedicated bacterial flora analysis program (Flora Genesis) that processes data generated by next-generation sequencers automatically, at high speed and with high accuracy.
Bacterial Flora Analysis Requests
- What types of samples can be analyzed?
- Any sample containing bacteria can essentially be analyzed. The main samples are feces and saliva. Besides these, human and animal tissues infected with bacteria can be analyzed. However, because of large amounts of host gDNA, efficiency of bacterial 16S rRNA amplification may be low. Please contact us for inquiries.
- How much volume is required for analyses?
- It depends on the bacterial concentration in the sample. Approximately 100 mg of feces (about the size of a soybean) or approximately 500 µL of saliva is necessary. The amount necessary for analysis depends on the type of sample, so please contact us for inquiries.
- Can DNA samples extracted at an individual lab be analyzed?
- Yes, this is possible. However, Repertoire Genesis, Inc. will need to check the quality of the samples using a NanoDrop before starting bacterial flora analysis. If the concentration is extremely low, favorable results may not be obtained. In that case we will contact you. If the customer decides to cancel the analysis, the cost of the quality check will be billed.
- How should samples be stored?
- The distribution of bacteria in samples for bacterial flora analysis changes as bacteria grow. Therefore, we recommend freezing samples as soon as possible after collecting samples. Furthermore, please follow the manufacturer’s suggested protocol when a collection kit designed for feces is used.
- Can you return leftover samples to us?
- Samples sent to us will be stored at our facility for 3 months after receiving and then discarded as routine. If you prefer to retain samples, please contact us.
- How should samples be prepared and shipped?
- Samples should be handled as described above for storage, so please send them to us frozen. Please follow the manufacturer’s suggested protocol when using a specialized collection kit.
- What is the delivery schedule?
- Delivery usually takes approximately 1–2 months. After we receive the samples and plan the analysis schedule, we will notify you of the approximate delivery date. However, the delivery date may change depending on how busy the analytical instruments are. If you need the results sooner because of time constraints such as a meeting or publication deadline, please consult us.
- Can I order analysis using a sequencer solely for our sample?
- A next-generation sequencer can process a large number of reads; therefore, multiple samples, each tagged with its own index sequence, are pooled and sequenced all at once. If the number of samples is insufficient to fill one run, they will routinely be run together with other samples. Customers who prefer a sequencing run dedicated solely to their samples should contact us. We will send an estimate of the cost.
Evaluation of Sequence Data
- How many reads are obtained?
- Sequencing is performed with a goal of 5–10 × 10^4 reads. Because multiple samples are sequenced simultaneously, an exact number of reads cannot be specified. However, it is possible to limit the number of reads at the time of sequencing.
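To give a feel for why samples are pooled, the following sketch estimates how many indexed samples fit in one run at a given read target. It assumes the read goal applies per sample, and the run output of 15 million reads is a hypothetical figure: the real number depends on the instrument, the reagent kit, and run quality.

```python
def samples_per_run(run_reads: int, reads_per_sample: int) -> int:
    """Number of indexed samples that fit in one run at the target depth."""
    return run_reads // reads_per_sample


RUN_READS = 15_000_000   # hypothetical run output, for illustration only

for target in (50_000, 100_000):   # the 5-10 x 10^4 read goal
    print(target, samples_per_run(RUN_READS, target))
```

The estimate ignores practical limits such as the number of available index sequences and uneven pooling, so it is an upper bound rather than a plan.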
- How can the analysis results be interpreted?
- In addition to client and sample information, the bacterial flora analysis report includes the bacterial flora data identified for each individual sample, such as distribution data and ranking data, as well as comparison data showing the differences between samples.
- How is sequence data analyzed?
- Based on Fastq data generated by sequence analysis, primary analyses such as phylogenetic tree analysis, distribution mapping, and ranking are performed using our original software for bacterial flora analysis.
- What type of data will be delivered?
- The primary analysis data generated by the bacterial flora analysis software and the Fastq data generated by sequencing will be delivered to clients. If a client requests secondary analysis as an option, secondary analysis data will also be delivered.
- What type of analyses can be included as an option?
- Analyses such as diversity analysis, principal component analysis, and multivariate analysis are available. Combining bacterial flora analysis with more complex analyses, such as identifying relationships with clinical data and other data, allows meaningful conclusions to be drawn.
- How can comparison between samples be performed?
- Clients who submitted multiple samples will receive multiple distribution maps according to the number of samples submitted. We also recommend comparing types of bacteria by % based on the ranking data. These analyses can be performed in the primary analysis. However, please contact us for more complex secondary analyses such as multivariate analysis.
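Comparing bacteria by percentage can be sketched as follows. The genus names are real, but the read counts, sample names, and the input format (read counts per taxon) are assumptions made for the example.

```python
def percent_abundance(read_counts: dict[str, int]) -> dict[str, float]:
    """Convert per-taxon read counts into percentages of the sample total."""
    total = sum(read_counts.values())
    return {taxon: 100.0 * n / total for taxon, n in read_counts.items()}


# Hypothetical genus-level read counts for two samples
sample_a = {"Bacteroides": 600, "Faecalibacterium": 300, "Escherichia": 100}
sample_b = {"Bacteroides": 200, "Faecalibacterium": 300, "Bifidobacterium": 500}

pct_a, pct_b = percent_abundance(sample_a), percent_abundance(sample_b)

# Difference in percentage points (sample B minus sample A) for every taxon seen in either sample
taxa = sorted(set(pct_a) | set(pct_b))
difference = {t: pct_b.get(t, 0.0) - pct_a.get(t, 0.0) for t in taxa}

for t in taxa:
    print(t, pct_a.get(t, 0.0), pct_b.get(t, 0.0), difference[t])
```

Converting to percentages first matters because samples in a pooled run rarely receive the same number of reads.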