31 May 2009

What is "Large"?

I saw a link this morning on Twitter to an article by Tamura Jones about GEDCOM file sizes and what a "large" file is. It's a great article, comparing how different genealogy websites and vendors define a "large" GEDCOM file. In the end, the author defines a file of 5,000 individuals as small and 25,000 as medium.

I can understand this number in terms of file size and what technology should be able to accomplish. The article really puts into perspective the different capabilities of each site/vendor and the way each one creates its own definition of "large." It also chastises those who support only smaller files yet call them large. Makes a lot of sense.

But, this article got me thinking. I'm curious: how large is the average GEDCOM of a hobbyist researcher? Does the average researcher ever reach "medium" size in their own GEDCOM? It seems so large to me!

Personally, my GEDCOM has 2,430 individuals and is 980KB (using Reunion). I suppose it would be defined as x-small. :)

I have three lines that go back 10 generations, my shortest is only 5, and the average is 8 generations. Lately I've been more concerned with adding supporting research than expanding my tree. I do research collateral lines, though I often stop after two or three generations away from my direct line.

So, to anyone reading this: How big is your GEDCOM and what is your research strategy?
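
If you aren't sure of your own numbers, here is a minimal Python sketch for checking them. It assumes your program exports a standard GEDCOM text file in which each person starts with a level-0 record line such as `0 @I1@ INDI`. The function names and the file name `family.ged` are made up for illustration, and the encoding handling is a guess (older exports may not be UTF-8), so treat the result as a rough count rather than an exact one.

```python
import os
import re

# A level-0 record line for an individual looks like "0 @I1@ INDI".
INDI_RECORD = re.compile(r"^0\s+@[^@]+@\s+INDI\b")


def count_individuals(lines):
    """Count level-0 INDI records in an iterable of GEDCOM lines."""
    total = 0
    for line in lines:
        # Drop a possible byte-order mark and surrounding whitespace.
        cleaned = line.lstrip("\ufeff").strip()
        if INDI_RECORD.match(cleaned):
            total += 1
    return total


def gedcom_size(path):
    """Return (number of individuals, file size in KB) for a GEDCOM file."""
    # Assumption: undecodable bytes are replaced rather than raising an
    # error, since only the record tags matter for counting.
    with open(path, encoding="utf-8", errors="replace") as handle:
        individuals = count_individuals(handle)
    size_kb = os.path.getsize(path) / 1024
    return individuals, size_kb


if __name__ == "__main__":
    path = "family.ged"  # hypothetical file name; use your own export
    if os.path.exists(path):
        people, kb = gedcom_size(path)
        print(f"{people} individuals, {kb:.0f} KB")
```

Most genealogy programs will also report the number of individuals somewhere in their own statistics, which is the easier route if yours does.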


Greta Koehl said...

This is an interesting subject and you ask a good question. My GEDCOM (I also use Reunion) has 6297 people in it right now (that includes a handful of people whom I have "disconnected" and probably 100+ people with no names - posited spouses and children with no names). My research strategy is, in general, to go back a generation at a time rather than taking a single line all the way back at once. I'm now doing the gg-grandparent level and this is the last level for which I will try to do "all descendants of"; the exception is the line that is my main research focus - I'll do all the descendants of the ggg-grandfather for that one. The rest of my research efforts will be on brickwalls and families that branch off from well-researched lines but are not well researched themselves.

Apple said...

My maternal file has something like 35,000 individuals. I'm not related to them all; some are cousins of cousins or not related at all. Sometimes I enter information just because I have it; who knows, it may come in handy later. I have been trying to compile a list of all of the descendants of two different ancestors and that causes the file to swell also. My paternal file and my husband's line are both under 4,000 and much easier to work with!

Randy Seaver said...

Frankly, I like girls with big GEDCOMs... oh, that wasn't the question.

To me, a "big" GEDCOM is one that causes FTM 2009 to take more than 30 seconds to open. That number is around 10,000 I think.

I'll have to go play with it and blog about it. Thanks for the link to TJ's article. I read it last night.

