Searching for Persons in User Contributed Trees

By Geoffrey Slinker, Ancestry Employee
The postings on this site are my own and do not reflect the views or opinions of Ancestry.

Disclosure

The purpose of this post is to give insights on how I use the new searching experience for member tree data, how I construct searches, and how I interpret the results. Maybe my explanations will help you find what you are looking for.

Searching Public Member Trees

https://www.ancestry.com/search/collections/pubmembertrees/

Ancestry recently changed public / private member tree searching.
Be aware that you can only search for non-living persons. So, searching for yourself isn't expected to give results, and the same is true for any living relative or person.

Person Centric Searching

Even though the search is called "Public Member Trees" you are not searching for a tree. A member tree search would have fields for searching by a member's name or by a member's tree name.


This search is about finding persons that are represented in tree data that has been submitted by customers. Therefore it is a person search. That is important to understand.

Member tree persons may contain data that does not have a reference to an official document like a census. That doesn't mean the data is incorrect; it just means that there is no documentation to back up the data. Such data can still be insightful or inspire you to search in different areas. For example, a member tree person might have a death place of Tennessee. You may have never considered that the person could have died there. Now you can start looking for records in Tennessee to support this possibility.

Duplicate Results

As a tree goes back we share many common ancestors. These common ancestors are duplicated over and over in many trees. For example, if you have a family tree and your sister has a family tree then your parents have been duplicated. In that case if you searched for your parent you should find two results for your father and two results for your mother. These are duplicate results.


Lincoln appears in over 22,000 trees. These 22,000 Lincolns may be different or may be mostly the same. Sometimes people put in incorrect data, or leave out data. Ancestry takes the search results and groups the results into "buckets" where the data seems to agree so that the user doesn't have to page through 10,000 pages of results to see something different.

The purpose of this grouping is to help you navigate through probable duplicates as fast as possible.



This grouping of duplicates can be referred to as "groups", "clusters", or even "buckets". The member tree persons that are very similar are put in the same group, or in other words in the same cluster, or they landed in the same bucket.
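One way to picture this bucketing is the small Python sketch below. It is only an illustration: the agreement key (lowercased name plus birth year) and the sample persons are assumptions made for this example, not how Ancestry actually decides that two persons agree.

```python
from collections import defaultdict

# Hypothetical tree persons: (tree id, name, birth year). Sample data only.
persons = [
    ("tree-1", "Abraham Lincoln", 1809),
    ("tree-2", "Abraham Lincoln", 1809),
    ("tree-3", "abraham lincoln", 1809),
    ("tree-4", "Abraham Lincoln", 1808),  # disagrees on the birth year
]

def bucket_key(person):
    """Assumed notion of 'the data seems to agree': same name and birth year."""
    _tree, name, birth_year = person
    return (name.lower(), birth_year)

def group_into_buckets(people):
    """Put persons whose keys agree into the same bucket."""
    buckets = defaultdict(list)
    for person in people:
        buckets[bucket_key(person)].append(person)
    return list(buckets.values())

buckets = group_into_buckets(persons)
for bucket in buckets:
    # Show one representative per bucket plus the size of the bucket.
    print(bucket[0][1], bucket[0][2], "-", len(bucket), "copies")
```

With this toy key, the three persons that agree land in one bucket and the person with a different birth year lands in a bucket of its own.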

Exact vs Not Exact Searching

Exact means the results must have the value you are searching for.
Not Exact (Broad) means the results should have the value. A result that has the value is sorted / ranked higher than results that do not, but results without the value are not excluded.

Let's go through some examples of using Exact with name searches.


In this search the "First & Middle Names" is marked as "exact" and the "Last Name" is not marked as exact. Birth and Death information was added to the search to reduce the result set so that it is easier to see how manipulating "Exact" affects names.

https://www.ancestry.com/search/collections/pubmembertrees/?name=Jonathan_Joanes&birth=1830_ohio-usa_38&death=_iowa-usa_18&birth_x=0-0-0_1-0&death_x=_1-0&name_x=1_psx

Notice that the results have "Jonathan" matched exactly. The last name "Joanes" is not exact and the results have two persons with the last name "Jones".




https://www.ancestry.com/search/collections/pubmembertrees/?name=Jonathan_Joanes&birth=1830_ohio-usa_38&death=_iowa-usa_18&birth_x=0-0-0_1-0&death_x=_1-0&name_x=1_1

Making the last name "Exact" returns no results (this could change, because the system is continuously updating).


https://www.ancestry.com/search/collections/pubmembertrees/?name=Jonathan_Joanes&birth=1830_ohio-usa_38&death=_iowa-usa_18&birth_x=0-0-0_1-0&death_x=_1-0&name_x=1_s

Making the last name "Exact" with "Similar" means that the results must have the last name of Joanes or a name that is similar to Joanes.



https://www.ancestry.com/search/collections/pubmembertrees/?name=Jonathan_Joanes&birth=1830_ohio-usa_38&death=_iowa-usa_18&birth_x=0-0-0_1-0&death_x=_1-0&name_x=i_s

Changing the first name to "Exact" and "Initials" means the results must have the first name of Jonathan or a first name that starts with the letter "J". Results named Jonathan should come before other names starting with J, such as Jane or John. This ordering of Jonathan before names that match only the initial is called Ranking. Ranking is how results are sorted. In this case Jonathan is considered more important and is therefore ranked higher than Jane and John.
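The four search links above differ only in their name_x query parameter. A short Python sketch using the standard urllib.parse module makes that easy to see. The reading of each name_x value in the comments simply restates the descriptions in this post; it is an inference, not a documented Ancestry URL format.

```python
from urllib.parse import urlsplit, parse_qs

# Shared part of the four links above; only the name_x value changes.
BASE = ("https://www.ancestry.com/search/collections/pubmembertrees/"
        "?name=Jonathan_Joanes&birth=1830_ohio-usa_38&death=_iowa-usa_18"
        "&birth_x=0-0-0_1-0&death_x=_1-0&name_x=")

# name_x values copied from the links above, with the behavior this post
# describes for each one (inferred, not an official specification).
variants = {
    "1_psx": "first name exact, last name not exact",
    "1_1": "first name exact, last name exact",
    "1_s": "first name exact, last name exact with Similar",
    "i_s": "first name exact with Initials, last name exact with Similar",
}

def name_settings(url):
    """Return the (name, name_x) query values of a search link."""
    query = parse_qs(urlsplit(url).query)
    return query["name"][0], query["name_x"][0]

for value, meaning in variants.items():
    name, name_x = name_settings(BASE + value)
    print(f"{name}  name_x={name_x}  ({meaning})")
```

Comparing links this way is a quick check that only the setting you meant to change actually changed between two searches.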


Change the search form to search for John Black. This will give another example.




https://www.ancestry.com/search/collections/pubmembertrees/?name=John_Black&birth=1820_ohio-usa_38&birth_x=0-0-0_1-0&name_x=s_s

In this search "Exact & similar" is checked for both "First & Middle Names" and for "Last Name".

The first result is John P Black and the last result is John Block. In the middle are two Janes. Since John Black is what was searched for, any John Blacks will be sorted / ranked to the top.



https://www.ancestry.com/search/collections/pubmembertrees/?name=John_Black&birth=1820_ohio-usa_38&birth_x=0-0-0_1-0&name_x=_s

If the first name is NOT exact then any first name will be returned. If the result has John then it should be sorted / ranked higher than results that do not have John as a first name.




The last results have the last name Black, but the first name can be any value.

Exact means "must have". The results must have the value.

Not Exact means "should have". The results should have the value. Results that do have the value are ranked higher than results that don't, but none are excluded.

If "Exact" is selected with options like "Similar" or "Sounds Like" the results must have the value or it must have a value that is similar or it must have a value that sounds like the search value. "Similar", "Sounds Like", "Initials" and other options are called Expansions. Results that only match on an expansion should sort / rank lower than a result that matches the term exactly.
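The "must have" versus "should have" distinction can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the sample results, the field names, and the simple count-based score are all invented for this example and are not Ancestry's actual ranking. Expansions such as "Similar" are left out to keep the sketch short.

```python
# Hypothetical results; each is a dict of field values. Sample data only.
results = [
    {"first": "Jane", "last": "Black"},
    {"first": "John", "last": "Block"},
    {"first": "John", "last": "Black"},
]

def search(results, must=None, should=None):
    """Filter on 'must' (Exact) fields, then rank by 'should' (Not Exact) fields."""
    must = must or {}
    should = should or {}
    # Exact: a result is dropped unless every must-have value is present.
    kept = [r for r in results
            if all(r.get(field) == value for field, value in must.items())]

    def score(r):
        # Not Exact: each matching value only raises the rank.
        return sum(r.get(field) == value for field, value in should.items())

    # sorted() is stable, so results with equal scores keep their order.
    return sorted(kept, key=score, reverse=True)

# Last name Exact, first name Not Exact.
ranked = search(results, must={"last": "Black"}, should={"first": "John"})
for r in ranked:
    print(r["first"], r["last"])
```

In this toy run John Block is excluded because the last name is a "must have", and John Black is ranked above Jane Black because the first name is a "should have".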

Explanation of Grouping / Clustering

When a search is performed the results are shown with the most probable duplicates grouped / clustered together. The first member of the group / cluster is shown in the result panel with a link to "View all" of the results in that cluster. Sometimes the first member of the group might not look like it matches what you searched for.




In this search we want a last name that is exactly "Zebadee". So why is the second result named "Zabanee" when I asked for exactly "Zebadee"?

Click on "View all" and page through the results. There will be a version of the person with the name "Zebadee" in there. What this is showing is that there is uncertainty around this person's last name. This person is also known by other names.




By looking at the cluster you can see that this person is known as "Sarah Grist", "Sarah Zebadee", "Sarah Zebuner", "Sarah Zebennee", and just "Sarah".

In this example, would you have ever thought to check out "Zebuner" as a possibility? Ideally this clustering is providing alternate versions of the same person.
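The behavior described above, where a cluster counts as a match when any of its members matches, can be sketched in Python as follows. The names are the ones listed for this cluster; which member is shown first and the matching rule itself are assumptions made for illustration.

```python
# Name variants listed above for one cluster. Treating the first entry as the
# member shown in the result panel is an assumption for this example.
cluster = ["Sarah Zebennee", "Sarah Grist", "Sarah Zebadee",
           "Sarah Zebuner", "Sarah"]

def last_name(full_name):
    """Return the last word of a name, or '' when only one name is given."""
    parts = full_name.split()
    return parts[-1] if len(parts) > 1 else ""

def cluster_matches(members, wanted_last_name):
    """A cluster matches when any member has the wanted last name."""
    return any(last_name(m) == wanted_last_name for m in members)

# The displayed member alone does not match the exact search...
print(last_name(cluster[0]) == "Zebadee")
# ...but the cluster as a whole does, because one member is Sarah Zebadee.
print(cluster_matches(cluster, "Zebadee"))
```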



Name Searching

Similar Names





https://www.ancestry.com/search/collections/pubmembertrees/?name=Sean_Green&birth=1969&death=_new+york-usa_35&birth_x=0-0-0&death_x=0-0-0_1-0&name_x=s_1

Sean and John are similar names. "John" is a very popular name with many variations, including Ivan and Shane.

A birth year and death location are used to narrow the results so that it is obvious how the search changes when "Similar" is selected.



https://www.ancestry.com/search/collections/pubmembertrees/?name=Sean_Green&birth=1969&death=_new+york-usa_35&birth_x=0-0-0&death_x=0-0-0_1-0&name_x=1_1

With "Similar" turned off, only Sean is returned in the results.

Change the search query to Shane.



https://www.ancestry.com/search/collections/pubmembertrees/?name=Shane_Green&birth=1969&death=_new+york-usa_35&birth_x=0-0-0&death_x=0-0-0_1-0&name_x=s_1

No one is found with the exact name "Shane", but when the query is expanded to names similar to Shane the results contain John and Sean, as expected.
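A "Similar" expansion can be pictured as a lookup in a table of name variants. The Python sketch below uses a toy table built only from the names mentioned in this post; the table and the matching function are assumptions for illustration, and the real list of similar names is certainly much larger.

```python
# Toy variant table built only from the names mentioned in this post.
SIMILAR = [{"john", "sean", "shane", "ivan"}]

def expand_similar(name):
    """Return the name plus any variants from the toy table."""
    key = name.lower()
    for group in SIMILAR:
        if key in group:
            return group
    return {key}

def matches(search_name, result_name, similar=True):
    """Exact match, optionally widened to similar names."""
    allowed = expand_similar(search_name) if similar else {search_name.lower()}
    return result_name.lower() in allowed

print(matches("Shane", "John"))                # Similar on
print(matches("Sean", "John", similar=False))  # Similar off
```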

Sounds Like

"Sounds Like" refers to using phonetic algorithms to determine if two words sound alike. These algorithms vary and none are perfect.


Scrolling down...



Jeffery, Jeffrey, and Geoffrey sound alike.

As an example of when phonetic algorithms fall short, Dwight and Ted are found to sound alike. 


When that happens, don't let it discourage you. Use similar names or initials, or try other names that you feel might be relevant. There is no system that can do everything, so use some creative ideas and keep trying if you don't see the results you are looking for.
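To give a feel for what a phonetic algorithm does, here is a compact Python sketch of American Soundex, one well-known example. This post does not say which algorithms Ancestry uses, so Soundex is chosen here purely as an assumption for illustration.

```python
# Letter groups and their Soundex digits (American Soundex).
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit

def soundex(name):
    """Return the four-character Soundex code of a name."""
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""
    out = letters[0].upper()          # the first letter is kept as-is
    prev = CODES.get(letters[0], "")
    for c in letters[1:]:
        if c in "hw":
            continue                  # h and w do not separate equal codes
        code = CODES.get(c, "")       # vowels have no code
        if code and code != prev:
            out += code
        prev = code                   # a vowel resets the previous code
    return (out + "000")[:4]          # pad or trim to four characters

for name in ("Jeffery", "Jeffrey", "Geoffrey"):
    print(name, soundex(name))
```

Under this particular algorithm Jeffery and Jeffrey share the code J160, while Geoffrey gets G160 because Soundex always keeps the first letter. A different phonetic algorithm could group all three together, which is one reason results vary between algorithms.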


Initials

Searching by initials takes the First and Middle Names and converts them to initials. The last name is not involved.

John Smith is converted into J Smith.
John David Smith is converted into J D Smith.
J D Smith is converted into J D Smith.




The sorting / ranking for name searches places the full match first, in this case John David. Any results that only match on initials will come after, such as James Donald and Joe Dan.
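The conversion to initials and the ranking of full matches ahead of initial-only matches can be sketched in Python like this. The function names and the candidate list are hypothetical, and the real ranking likely weighs more than this simple two-level order.

```python
def to_initials(first_and_middle):
    """'John David' -> 'J D'; a value that is already initials is unchanged."""
    return " ".join(part[0].upper() for part in first_and_middle.split())

def rank_by_initials(search_names, candidates):
    """Keep candidates with the same initials; full-name matches come first."""
    wanted = to_initials(search_names)
    kept = [c for c in candidates if to_initials(c) == wanted]
    # False sorts before True, so exact full matches lead; the sort is stable.
    return sorted(kept, key=lambda c: c.lower() != search_names.lower())

print(to_initials("John David"))
print(rank_by_initials("John David",
                       ["James Donald", "Joe Dan", "John David", "Mary Ann"]))
```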


This OR That

Did you know that putting a comma between two first names means "this or that"?

"John, James" in the search field with exact selected means the resulting person must be named John or James.
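In code terms, the comma turns one search value into a list of alternatives. A minimal Python sketch, with hypothetical function names and a case-insensitive comparison chosen as an assumption:

```python
def alternatives(field_value):
    """Split 'John, James' into ['John', 'James']."""
    return [part.strip() for part in field_value.split(",") if part.strip()]

def matches_any(field_value, first_name):
    """True when the first name equals any of the comma-separated options."""
    options = {a.lower() for a in alternatives(field_value)}
    return first_name.lower() in options

print(alternatives("John, James"))
print(matches_any("John, James", "James"))
print(matches_any("John, James", "Jane"))
```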






Wild Card Searches

Wild cards are an advanced searching feature. A wild card is a special character in the search text that means this part of the text could be any letter.

An asterisk (*) means that any number of letters can be represented.

Mar* would match Mary, Maria, Martha, Margaret, etc.

A question mark (?), sometimes called a hook, means that only one letter can be in that place.
Mar? would match Mary, Mark, Mars, etc.
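Python's standard fnmatch module happens to use the same two symbols with the same meaning, so it can be used to try out patterns before typing them into the search form. This is only a stand-in for illustration; the search site may apply additional rules of its own that fnmatch does not model.

```python
from fnmatch import fnmatchcase

def wildcard_match(pattern, name):
    """Case-insensitive match where * is any run of letters and ? is one letter."""
    return fnmatchcase(name.lower(), pattern.lower())

names = ["Mary", "Maria", "Martha", "Margaret", "Mark", "Mars"]

# Mar* accepts any number of letters after "Mar".
print([n for n in names if wildcard_match("Mar*", n)])

# Mar? accepts exactly one letter after "Mar".
print([n for n in names if wildcard_match("Mar?", n)])
```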




Notice the results are Mary, Margaret, and Marie.

Change the search query to "Mar?".




Notice that Margaret is not in the result set anymore.

Relation Name Searches

Searching on the names of a family relation can be done for Father, Mother, Spouse, and Child.







Notice that the search on a family member's name may have options for exact or not exact. 


Changing the search so that Maria isn't exact does two things. First, it allows for phonetic matches on Maria. Second, matching on Maria isn't required anymore. The search results should have Maria or a name that sounds like Maria, but any result will be returned. Maria should be sorted / ranked first, names that sound like Maria should be next, and then any name after that.






Notice in the results that Maria is first, followed by several Marys.

In this search example the mother's last name is Jones and it is set to exact. That means the mother must have the name Jones.

You can change the search on the mother's last name as well.






Notice that Smythe comes before Smith and that Mary is a "must have".

Wild card searching works with relation names as well.




Event Searching

Birth, Death, and Marriage are known as events. An event is something that took place somewhere or on some date. Often it is something that took place on a specific date in a specific place.

The following example limits the search with a last name and a death date so that only a small number of results are found, which makes it easy to see how changing values affects the results. The birth event is the focus of this example.




In this search the birth must be in Kentucky and if the birth is on the 17th day of the month it should sort / rank to the top.



If exact is selected under the year this means that the date must match. In this case the date must be on the 17th day of the month and it must have occurred in Kentucky.





Changing the search to April with the exact selected under Year will find anyone with a birth in April.




When exact is not selected under Year and the month is set to April then April should sort / rank higher than other results.




https://www.ancestry.com/search/collections/pubmembertrees/?name=_Schlenker&birth=-4_kentucky-usa_20&death=1900&birth_x=_1-0&count=50&death_x=0-0-0&name_x=i_s

The event search can be changed so that the place is not exact but the date is exact. That means the results must have the date.



https://www.ancestry.com/search/collections/pubmembertrees/?name=_Schlenker&birth=-3-10_kentucky-usa_20&birth_x=0-0-0&count=50&name_x=i_s

In this case the results must have the 10th of March and if the event also happened in Kentucky it should be sorted / ranked higher.
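A date that is exact while the place is not can be sketched in Python as a filter on the date followed by a sort on the place. The Event type, the sample births, and the rule that blank date parts are not checked are assumptions made for this example, not a description of Ancestry's implementation.

```python
from typing import NamedTuple, Optional

class Event(NamedTuple):
    year: Optional[int]
    month: Optional[int]
    day: Optional[int]
    place: str

def date_matches(wanted, event):
    """Date parts left as None in the search are not checked."""
    return all(w is None or w == e
               for w, e in zip(wanted[:3], event[:3]))

def search_births(births, wanted):
    """Date is a must-have; place only affects the ranking."""
    kept = [b for b in births if date_matches(wanted, b)]
    # False sorts before True, so births in the wanted place come first.
    return sorted(kept, key=lambda b: b.place != wanted.place)

# Hypothetical births. Sample data only.
births = [
    Event(1850, 3, 10, "Ohio"),
    Event(1862, 3, 19, "Kentucky"),
    Event(1871, 3, 10, "Kentucky"),
]

# 10 March of any year, preferably in Kentucky.
wanted = Event(None, 3, 10, "Kentucky")
for b in search_births(births, wanted):
    print(b)
```

In this toy run the 19 March birth is excluded even though it happened in Kentucky, and the Kentucky birth on 10 March is ranked ahead of the Ohio birth on the same day.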

At this point it is important to review the grouping or clustering of results. If the search actually worked, every result must have the 10th of March, yet the second result shows the 19th of March. What is this?

Click on the "View all". There will be a 10th of March in the group / cluster. This indicates that there is uncertainty around this person's birth day.

So, when the results do not look correct, remember that the results are grouped / clustered together and the search will match, it just might take a little investigation to recognize the match.


Conclusion

Member tree searching is not searching for someone's tree; it is searching for someone in any member's tree. It is searching for people.

After the search is performed the resulting persons are grouped / clustered together. This clustering makes it easier to navigate through duplicated persons.

Appendix

Comparison to Previous Search Experience

Below are some screen shots from the previous search experience. The intent is to show how clustering results helps navigate through duplicate persons.

Previous system results for Glen Slinker.




I captured screen shots of the first 22 results, all of which are the same person. I had to go through several pages of results before Marvin or David were found.

Here are the results from the new system that clusters duplicates.




Notice that the result for Erastus Glen Slinker is found in 64 public trees. These duplicates are grouped / clustered together. To see all of the different copies of Erastus Glen Slinker found across 64 trees and 27 private trees just click "View all".




