Friday, 22 August 2014

clarifY DNA - a new Y-SNP analysis service

clarifY DNA is a new Y-DNA analysis service from Chris Morley, a well-respected citizen scientist in the genetic genealogy community who is best known for his Geno 2.0 subclade predictor and his experimental Geno 2.0 trees. The methodology is outlined in his white paper "An experimental computer-generated Y-chromosomal phylogeny, leveraging public Geno 2.0 results and the current ISOGG tree". The new service is a natural development from the Geno 2.0 tool and allows users to receive a computer-generated phylogeny based on next-generation sequencing results. The service is currently restricted to an analysis of Big Y VCF/BED files, but there are plans to add the Full Genomes test (from a text file output) and the Chromo 2 test from BritainsDNA in due course. The analysis currently costs $30, which includes the initial analysis and a subscription providing further updates at least until the end of 2014.

You first need to register for an account. Once your payment has been approved and you've uploaded your files, the automated report can be generated. The reports are manually checked before being uploaded to the website, and I understand the turnaround is usually within 24 hours, though it is often much quicker. Once the report is ready you can download the PDF file from the phylogenetic reports menu.
Here is the tree generated from my dad's Big Y files.
The tree is very clear and easy to understand. It builds on the good work of the ISOGG Y-SNP tree but also provides a more provisional perspective. clarifY DNA communicates which aspects are accepted, which aspects are provisional, and which aspects are most in need of further investigation. The tree is also a vast improvement on the current Family Tree DNA haplotree. The FTDNA tree was produced in partnership with the Genographic Project but the cut-off date was November 2013 and the tree does not include any of the new SNPs identified from testing with Big Y, Full Genomes and Chromo 2. The FTDNA tree still shows my dad's most downstream SNP as Z12 (a branch of R1b-U106), yet he had already tested positive for Z12 prior to taking the Big Y test.

According to the clarifY DNA analysis my dad has 18 private SNPs (all the SNPs highlighted in orange on line 14), which is the same number of private SNPs identified by the U106 project team. For genealogical purposes it is of course these private SNPs which are of the most interest. In the long term, as more people get tested, we should in theory be able to establish precisely where all these private SNPs are positioned on the tree, and we will have the complete branching process of our Cruwys/Cruse/Cruise tree right down to the last few hundred years.

The report includes some of the technical details about how the algorithm works which I've reproduced here for reference:
The contents of this report were produced by a computer algorithm. This report will be frequently re-generated as more information becomes available. The pilot-scale implementation of this algorithm is able to process a dataset of over 4000 Big Y kits (over 400 real and 3600 simulated) in one run. 
clarifY DNA’s automation capabilities analyse large Y-SNP datasets with great speed, great accuracy and great comprehensiveness. These facets are critical for: helping a testing company’s customers make informed SNP-ordering decisions; uniting customers and/or research participants with their most meaningful patrilineal matches; and, overall, scientific progress, customer satisfaction and further growth. 
All in all, clarifY DNA’s software is the key to truly realising the “Y Tree” in “Family Tree”.
The phylogenetic algorithm employed here was initially developed in June 2013 for Geno 2.0 data; see for similar reports (from an earlier version of the phylogenetic algorithm) leveraging public Geno 2.0 data. While this report represents a large advance over existing Y-DNA trees, please treat some aspects of this report as experimental and preliminary; some enhancements specific to next-generation sequencing have not been exhaustively tested, and there are several discrepancies over the definitions of high-level SNPs.
The service also provides the option to contact your closest "genetic neighbours" on your branch of the Y-tree. You can opt to make your kit number and e-mail address available to your neighbours or you can choose to remain anonymous. If you opt not to reveal your email address, your matches can still send you a message, routed through, and it is then up to you to decide whether or not to reply (thereby revealing your email address).

All in all, this looks like a very promising new service which provides cutting-edge haplogroup analysis in a report which distils the pertinent information into an easy-to-understand phylogenetic tree. The value of the service will grow as more users contribute their data, and I understand that further enhancements are in the pipeline. clarifY DNA will be of particular benefit to people who have taken the Big Y test but who do not have the advantage of participating in a haplogroup project with administrators and team members who are actively involved in the interpretation and analysis of Big Y results. Even if you have received a detailed analysis from your project admins, the service is worthwhile for the clarity of the presentation of the tree, which helps to put your results in context.

Disclosure: I was given a complimentary analysis of my dad's Big Y data to enable me to write this review.

Wednesday, 20 August 2014

John Cruwys and Sarah Chown of Tiverton, Devon

Richard Chilcott has very kindly sent me this wonderful photograph of John Cruwys (1860-1919), Sarah Chown/Quant (1851-1921) and their daughter Winifred May Cruwys (1897-1983). The photograph is from the collection of Leslie Cruwys and is published with his permission.
John's wife Sarah Chown had previously been married to John Henry Quant (1850-1881), by whom she had three children. She married John Cruwys in 1882, a year after the death of her first husband. John and Sarah had eight children together, of whom Winifred was the youngest.

John Cruwys is from the Witheridge Cruwys tree which can be traced back to William Cruwys and Sarah Taylor who married on 2nd February 1820 in Witheridge, Devon. William is probably the William Purchase Cruys, who was baptised on 2nd November 1794 in Cheriton Fitzpaine and was the illegitimate son of Sarah Cruys.

John Cruwys worked as a stonemason. His family remember that he lost a leg following an accident. The accident might have occurred in 1901, because in the 1901 census he was a patient at St George's Hospital in Hanover Square in London. I was told a story by a local resident that John had skills as a carpenter and that if he ever broke his artificial leg he would go home and make himself another one! It was thought that he made his legs from chair legs strapped on from the knee down. You can clearly see from the crease in John's trousers that he has an artificial left leg just below the knee.

We don't know the exact date when the photograph was taken, but Winifred looks as though she's in her early teens, which would date the photo to about 1910. In the 1911 census John and Sarah were living at Jurishayes Cottage in Tiverton and the photograph was almost certainly taken outside their house.

Tuesday, 29 July 2014

Family Tree DNA reduce the price of the Big Y test to $595

I wrote at the weekend about the new Y Prime test from Full Genomes which is designed as a competitor to Family Tree DNA's Big Y test. Family Tree DNA have now responded by announcing a permanent reduction in the price of their Big Y test, and they have also introduced a few new features to the Big Y display. Note that FTDNA's Big Y is only available to existing FTDNA customers. It's good to see some healthy competition in the Y-DNA testing market. Here is the text of the e-mail that was sent out to project administrators:

Dear Project Administrators,

We are excited to announce the release of a new feature to help Big Y testers refine their matches!  Now, you'll be able to easily filter out matches that aren't genealogically relevant to you.

Also, as part of this release we are permanently reducing the price of Big Y to $595.

How it Works
The filter lists the subclades immediately upstream from the tester's terminal subclade.  When a subclade has been selected, a number appears next to the unselected subclades to indicate how far upstream or downstream they are from the selected subclade.

A subclade marked (+1) is the next clade upstream from the currently selected clade.  A subclade marked (-1) is the next clade downstream from the currently selected clade.  The number of matches available at each level is listed on the right side of the filter drop down. 

To help clarify the hierarchy of the subclades, the haplotree button has been updated to display subclades in the standard haplotree format.  The full tree can be viewed by clicking Go To Haplotree.  
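
For anyone who finds the (+1)/(-1) numbering easier to follow when it is spelled out, here is a minimal Python sketch of the labelling rule described in the e-mail. It is only my illustration and not FTDNA's code: the function name `label_offsets` and the placeholder subclade names "A" to "D" are assumptions, and the lineage is assumed to be a single line of descent listed from the most upstream subclade to the most downstream.

```python
def label_offsets(lineage, selected):
    """Label each unselected subclade with its distance from the selected one.

    `lineage` is assumed to run from most upstream to most downstream.
    Positive numbers mean upstream, negative numbers mean downstream,
    following the convention described in the e-mail.
    """
    selected_index = lineage.index(selected)
    labels = {}
    for index, subclade in enumerate(lineage):
        if index == selected_index:
            continue  # the selected subclade itself gets no number
        offset = selected_index - index  # earlier in the list = upstream = positive
        labels[subclade] = f"({offset:+d})"
    return labels


# Placeholder names only; these are not real subclades.
example = label_offsets(["A", "B", "C", "D"], "C")
for subclade, label in example.items():
    print(subclade, label)
```

With "C" selected, "B" is marked (+1) because it is the next clade upstream, and "D" is marked (-1) because it is the next clade downstream.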

Saturday, 26 July 2014

Full Genomes launches Y Prime - a new Y chromosome sequencing product

The following press release has been written by Full Genomes Corporation.

Full Genomes Corporation (FGC) is announcing today the introduction of a new Y chromosome sequencing product, dubbed Y Prime. The Y Prime test leverages recent technology advances to economically sequence large portions of a male's Y chromosome, enabling advanced, high-resolution tracing of direct paternal line ancestry.

FGC has worked with industry leaders to develop a new Y chromosome capture approach and has combined it with Illumina "next-gen" sequencing. The resulting data will be processed with the latest alignment algorithms to improve read mapping. The overall result is a cutting-edge product with Y chromosome coverage breadth that is close to that of FGC's original comprehensive Y sequencing product (now termed Y Elite), at a much lower cost. Additionally, the new product is priced lower than the leading competitor, while retaining a significant advantage in terms of quality and comprehensiveness.

FGC is releasing the following comparison statistics as estimates of test coverage based on Y Prime pilot results.

Y Prime will be offered at a standard price of $625. An introductory price of $599 is available for orders placed by August 31, 2014. Y Prime is currently available at the discounted introductory price through the Full Genomes website by ordering the Comprehensive Y test using the coupon code "YPRIME".

The new product is also expected to offer significant improvements in turnaround time for results. Testing will be performed by a U.S.-based sequencing facility.

Additionally, FGC has recently been developing new sample collection protocols, designed to reduce the frequency of delays due to the need for repeat sample collection.

Justin Loe, CEO of FGC, commented, "Our new product is consistent with our mission to deliver the best quality Y sequencing products at the most affordable prices possible, and to continue to innovate with new products targeted to the genetic genealogy community."

FGC will continue to offer the original comprehensive Y sequencing product, with sequencing performed at BGI, under the new name Y Elite. To help customers decide which product is right for them, FGC is releasing BED files to indicate the regions covered by representative tests; these are available at and can be used to determine whether a particular site or SNP of interest is likely to be covered by the test. Customers with questions may contact

DAK note: I am advised that Greg Magoon has further technical comparisons available for specialists, in the form of the raw data files from the pilot samples (BAM files and FGC analytical reports).
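
For readers who want to try the BED file check mentioned in the press release, the following is a minimal Python sketch of how one might test whether a SNP position falls inside a covered region. It is my own illustration, not a tool from FGC: the function names `parse_bed` and `is_covered`, the sample coordinates and the "chrY" chromosome label are all assumptions. It relies only on the standard BED convention that the first three columns are chromosome, start and end, with a zero-based start and an end that is not included in the region.

```python
def parse_bed(text):
    """Read BED-formatted text into {chromosome: [(start, end), ...]}."""
    regions = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the optional header lines BED files may carry.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        regions.setdefault(chrom, []).append((int(start), int(end)))
    return regions


def is_covered(regions, chrom, position):
    """Return True if a 1-based position lies inside any region on chrom."""
    zero_based = position - 1  # BED starts are zero-based, ends are exclusive
    return any(start <= zero_based < end for start, end in regions.get(chrom, []))


# Made-up coordinates for illustration only; not taken from any real test.
sample_bed = "chrY\t100\t200\nchrY\t500\t600\n"
regions = parse_bed(sample_bed)
print(is_covered(regions, "chrY", 150))
print(is_covered(regions, "chrY", 300))
```

A position reported as covered here only means that it falls inside a region listed in the BED file. As the press release says, that indicates a site is likely to be covered, not that a call is guaranteed for every customer.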

Update 29 July 2014
Full Genomes have announced that the Y Prime test will be offered at a new low price of $589.