Antibody & T-Cell Receptor Data Integration Workshop

Antibody and T-Cell Receptor Data Integration Project

Planning Meeting September 12-15, 2014

Organizing Committee: Jamie Scott & Felix Breden, SFU; Tom Kepler, Boston University

Schedule: All meetings will be held in the IRMACS Centre.
Friday, Sept 12th  
7:00 - 8:00 pm Introductory Remarks
Saturday, Sept 13th
9:00 - 12:30 pm Planning Meeting for Workshop I: Sequencing Immune Receptor Repertoires
1:30 - 4:30 pm Planning Meeting for Workshop II: Analysis Tools for Individual Repertoires
Sunday, Sept 14th  
9:00 - 12:30 pm Planning Meeting for Workshop III: Linking immune receptor databases with other datasets, and coordinating across studies
1:30 - 4:30 pm Planning Meeting for Workshop IV: Ethical and legal issues involving sharing of human and animal NGS data
Monday, Sept 15th  
9:00 - 12:30 pm Final meeting to review the plan for May 2015 meeting: Action Plans for each Workshop (including overarching issues to be addressed by each Workshop); Roles for Workshop Leaders and Advisors; Assign Workshop Leaders and Advisors; Communications Committee for advertising May meeting and communicating the results/recommendations of the May meeting to appropriate bodies (funding agencies, journals, societies)

Our main goals for this September’s Planning Meeting are to set the topics and organize the larger Community Meeting for May 29th - June 1st, 2015.  We will determine the specific goals for the Community Meeting, and for the initiative overall, and set the agenda for the Community Meeting.  Below we present four proposed main topics for the Community Meeting; these are only proposals, so please come prepared to expand on them or to suggest changes.  At the Planning Meeting we will also identify two people to be the discussion leaders and rapporteurs for each workshop.

     The goal of the proposed set of workshops in the May 2015 Community Meeting is to bring together all stakeholders in the use of NGS technologies to study antibody (Ab)/B-cell and T-cell receptor (TcR) repertoires.  Attendees for the Community Meeting will not be restricted to the HIV/AIDS field, but will include experts in vaccine development against a broad range of infectious agents, T-cell-based cancer therapeutics, and Ab-based therapeutics against cancer, autoimmune and inflammatory diseases, along with representatives from the diversity of communities invested in research (e.g., from scientific journals, public granting agencies, bioinformatic, biotechnology and pharmaceutical industries, intellectual-property-based legal firms, university and other ethics boards/councils, patient and community advocacy groups).

     The Community Meeting workshops are designed to develop standards and recommendations for: (i) obtaining, analyzing, curating and comparing/sharing NGS datasets; (ii) using and validating tools for analyzing Ab/TcR repertoire NGS data; (iii) relating NGS datasets to other “big data” sets, such as microarray, flow cytometric, and MiSeq gene-expression data; and (iv) legal and ethical issues involving the use and sharing of big data sets derived from human sources.  Each workshop will be led by two experts who will prepare and circulate beforehand topics for the workshop speakers to address.  We propose that most of the workshop time be devoted to discussion.  Workshop leaders will develop recommendations proceeding from their workshops, and the final session on Monday will comprise summaries presented by the workshop leaders, from which the group as a whole will reach consensus on final recommendations and action plans.  The proceedings of the workshops, including the recommendations and action plans, will be published for the larger scientific community.  At that time we will also decide on future workshops/meetings to continue this initiative.  Please look at the following Planning Meeting description, keeping in mind that it is organized to identify Workshop Leaders and develop topics for the Community Meeting in May 2015, which are also described.


 First Full Day: Generation and Analysis of Immune Receptor Repertoires

Planning for Workshop 1.  Sequencing immune receptor repertoires. 

Based on the goals for Community Workshop 1, described below, this planning workshop will be divided into two sessions.  Session 1 will identify quality issues involving cDNA sample preparation, PCR, and NGS.  Session 2 will discuss the use of IMGT/HighV-QUEST, SODA, IgBlast, JoinSolver, etc., for analyzing Ab/TcR NGS data, and how these tools compare.

Community Workshop 1.   Sequencing immune receptor repertoires. 

This workshop will focus on the initial stage of sequencing receptors from bulk and single B and T cells.  Topics for review include V, D, J germline genes, VH CDR-H3 length, secondary rearrangement/receptor editing, somatic mutations, and differences between NGS of TcR and Ab repertoires; relevant nomenclature will also be introduced.  Methods of cDNA preparation will be discussed, such as 5’-RACE and primer-based amplification, focusing on the ability of these methods to accurately reflect the repertoire.  Following that, current sequencing technologies will be described as applied to receptor repertoires, along with the sorts of error/bias that each introduces into the data, bioinformatic tools for estimating and correcting errors, and molecular methods for doing so, such as doping the repertoire with known sets of sequences to estimate error rates.  Other important topics will include tag technology for multiplexing repertoires, and the limits to high-throughput sequencing of VH/VL pairs from single cells.  This workshop will also review methods for assigning germline genes, VH CDR3 length, somatic mutations, and other parameters of immune receptor sequences; such tools include IMGT/HighV-QUEST, SODA, IgBlast, JoinSolver, etc.  Differences between TcR and Ab repertoire analyses will also be discussed.

      The standards to be discussed include: (i) Reporting sequence quality, (ii) Sources of error and bias, (iii) Reporting sequences removed from the database during “clean up”, (iv) Doping standards, (v) Recommending that journals and funding agencies require repertoire sequences to be publicly reported as a prerequisite for publication/funding, following accepted community standards.
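
As an illustration of the doping approach mentioned above, the following is a minimal sketch (not a method endorsed by the organizers) of how a per-base substitution error rate might be estimated from reads of a single known spike-in sequence.  The function names are hypothetical, and the sketch assumes that reads have already been assigned to the spike-in and that only reads of the same length as the reference are counted; insertions and deletions would require alignment and are ignored here.

```python
from typing import Iterable


def hamming(a: str, b: str) -> int:
    """Count mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def per_base_error_rate(reference: str, reads: Iterable[str]) -> float:
    """Estimate the per-base substitution rate from reads of one spike-in.

    Assumption: reads whose length differs from the reference are skipped;
    a real pipeline would align them and also count indels.
    """
    mismatches = 0
    bases = 0
    for read in reads:
        if len(read) != len(reference):
            continue
        mismatches += hamming(reference, read)
        bases += len(reference)
    if bases == 0:
        raise ValueError("no reads of matching length")
    return mismatches / bases
```

An estimate of this kind could accompany a deposited dataset as part of reporting sequence quality, alongside the number of reads excluded from the calculation.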

Planning for Workshop 2.  Analysis tools for individual repertoires.  Session 1 in this workshop will discuss NGS data analysis tools and tool-packages, and approaches to comparing and validating them. Session 2 will focus on clonotype definition and analysis, and the comparison and validation of these tools.

Community Workshop 2.  Analysis tools for individual repertoires.

This workshop will first review tools that are available for the analysis of individual Ab and TcR repertoires.  Among these tools, and specific to Ab repertoires, are tools that define clonal lineage, and go on to infer a common ancestor of that lineage, often with the goal of testing antigen binding by the common ancestor.  The currently available tools will be reviewed, along with the nomenclature and further refinement of clonal lineage analysis.  Efforts to use bioinformatics to infer VH and VL pairing will be examined, and compared to repertoires that are based on sequencing paired VH and VL from single cells.  As many of the tools to be discussed were developed for Ab repertoires, their application to TcR repertoires will be of interest.  The application of phylogenetic techniques for examining co-evolution of immune receptors and pathogens, or selection on antibody sequences, will also be discussed.

     The projected outcome of this workshop is the development of best practices for evaluating and comparing these analysis tools.
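
To make the clonotype question concrete, here is a minimal sketch of one commonly used working definition: sequences are grouped together when they share the same V gene, J gene, and CDR3 length.  This is only one of several possible definitions (others add a CDR3 similarity threshold), and the record type and field names below are hypothetical rather than taken from any of the tools named in this document.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Rearrangement(NamedTuple):
    """Hypothetical minimal record for one annotated receptor sequence."""
    seq_id: str
    v_gene: str
    j_gene: str
    cdr3: str  # CDR3 sequence (nucleotide or amino acid)


ClonotypeKey = Tuple[str, str, int]


def group_clonotypes(
    records: Iterable[Rearrangement],
) -> Dict[ClonotypeKey, List[str]]:
    """Group sequence IDs by (V gene, J gene, CDR3 length)."""
    groups: Dict[ClonotypeKey, List[str]] = defaultdict(list)
    for rec in records:
        key = (rec.v_gene, rec.j_gene, len(rec.cdr3))
        groups[key].append(rec.seq_id)
    return dict(groups)
```

Because different tools may apply different grouping rules, running two definitions over the same dataset and comparing the resulting groups is one simple way to see how much the choice of definition matters.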

 Second Full Day: NGS Databases, Linkage with Other Data Sets, and Sharing Data Sets

Planning Workshop 3.  Linking immune receptor databases with other datasets, and coordinating datasets across studies.  Session 1 will involve best practices for integrating NGS data with other metadata.  Session 2 will discuss different platforms for sharing and analyzing NGS data from different sources, and best practices for doing so.

Community Workshop 3.  Linking immune receptor databases with other datasets, and coordinating datasets across studies. 

This workshop will discuss associating immune receptor repertoires deduced by NGS with other types of linked metadata from each individual under study, such as gene-expression data from microarrays or RNASeq, epigenetic signals, proteomics, serologic and metabolomic data, along with clinical, social and demographic data.  Also to be discussed are systems biology approaches that use NGS data, an example being the NGS analysis of FACS-sorted B and T cell populations, and linkage between B and T cell NGS datasets and comparative analyses.  A major focus of this workshop will be the set-up of “scientific gateways” that mediate the capture and query of data from multiple, distributed immune receptor databases, and their associated datasets.

     The projected outcome of this workshop will be a standard data format for stored receptor sequences and associated data, and common formats for shared databases.  A process will be initiated for developing tools to link these datasets, and for statistically analyzing associations among data from the distributed data sets.

Planning Workshop 4.  Ethical and legal issues involving sharing of human and animal NGS data.  Session 1 will focus on ethical considerations involving best practices for use of human NGS data from multiple sources, including ownership, confidentiality and data security.  Session 2 will describe legal issues involved with sharing data from multiple sources, including the unique needs and constraints of industry.

Community Workshop 4.  Ethical and legal issues involving sharing of human and animal NGS data.

This workshop will start with a review of ethical and legal issues involved in sharing databases, and how multiple database comparisons can increase the value of the data for biomedical research and patient care.  We will discuss identifying the stakeholders in the production and use of “big” databases, and inclusion of stakeholders (i.e., patients and community advocates, researchers, agencies, industry) in all phases of the research, including communication and intellectual property ownership.  Finally we will discuss means of creating partnerships among academic, industrial, and stakeholder communities involved in Big Data.  It will thus be important to include experts in the legal, ethical and knowledge translation fields in this workshop.

     The projected outcome of this workshop will be mechanisms for ethically and legally sharing NGS and associated data sets.

Planning Meeting. Final Day.  Issues to be addressed in each Workshop of the Community Meeting will be summarized and confirmed, with focus on developing specific recommendations out of each workshop. Pairs of Workshop Leaders will be assigned.

Community Meeting. Final Day. This will be a review by the workshop leaders of the final recommendations, which will have been initiated in the workshops and then discussed more informally among attendees over the weekend.  The choice of workshop leaders is therefore crucial to this final session: they will need to be proactive in following up with attendees to discuss the recommendations coming out of their workshops.  Final recommendations will emerge from this consensus-building process.

     Workshop leaders, conference organizers, journal editors, representatives from granting agencies, and other stakeholders will remain in the afternoon to summarize the meeting and plan further strategies for communicating the recommendations produced by the workshops.

