Matching is a crucial element of integrating structured and semi-structured data. For instance, when organizations merge, the parties have to share information that is then fused into a new integrated system.

Matching schemata is the first step in this process. Schema matching aims to find correspondences between concepts describing the meaning of data sources. Human-in-the-loop data integration, and matching in particular, has recently been challenged by the big data era. With a richer variety of data sources, larger volumes, and higher velocity, obtaining and making use of human input has become much harder.

Acknowledging cognitive awareness in human matching, InCognitoMatch (Introducing Cognitive Biases to Crowdsourced Matching) is a handy tool to validate, annotate, and correct correspondences using the crowd.


The following information provides a short overview and instructions necessary for participating in the task.

The task begins with providing personal information (e.g., ID and nickname) and a (short) intelligence test.
Subsequently, a subset of matching correspondences will be introduced. At the end, you will be able to analyze your performance using a dedicated dashboard.

In the main part of the session, several element pairs will be introduced. Each pair consists of two terms, one from a candidate schema and one from a target schema. Each pair is accompanied by additional properties: term names and types, instance examples, the schemata hierarchy, the algorithmic matching result, and the majority decision of the other participants.

Your task is to provide feedback, stating your confidence level (on a [0,100] scale) about whether a pair is a match (the two terms correspond) or no match.
A confidence level higher than 50 indicates a preference for match, while a value lower than 50 indicates a preference for no match. Please note that you will not be able to choose a confidence level of exactly 50.
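
The scale can be read as a simple rule. The following is a minimal sketch for illustration only; it is not part of InCognitoMatch, and the function name decision_from_confidence is hypothetical.

```python
def decision_from_confidence(confidence: float) -> str:
    """Map a confidence level on the [0, 100] scale to a decision label.

    Hypothetical helper for illustration; not part of the tool itself.
    """
    if not 0 <= confidence <= 100:
        raise ValueError("confidence must lie in [0, 100]")
    if confidence == 50:
        # The task does not allow a neutral answer.
        raise ValueError("a confidence level of exactly 50 cannot be chosen")
    # Above 50 leans towards match, below 50 towards no match.
    return "match" if confidence > 50 else "no match"
```

For example, a confidence level of 80 reads as a fairly strong "match", while 10 reads as a strong "no match".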


Task Screen Example

The figure above illustrates the matching task using a matching question regarding the pair contactName (Term A) and contactName (Term B).

The two sides of the screen show information about the two terms. Each term name appears at the top and is marked in red in the center of the screen.
Following the name, the data types and instance examples are introduced. In this example, the types of both contactName (Term A) and contactName (Term B) are "str". An example instance of contactName (Term A) is "Dr. Gilboa Barak" and an example instance of contactName (Term B) is "Ms Einat Hadar".
The center part of the screen displays the schemata hierarchy. The sign indicates the term's level within the hierarchy. For example, the term contactName (Term A) is placed at the third level.
The color legend is as follows:
Green - Parents of the term.
Red - The term.
Blue - Siblings of the term.
The direct parent of contactName (Term A) is Contact and a sibling term is, for example, telephone.
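
The color legend can be thought of as a small tree lookup. The sketch below is an illustration under assumptions, not the tool's implementation: the root name "Order" is invented so that contactName lands at the third level, green is assumed to cover all ancestors of the term, and all function names are hypothetical.

```python
from typing import Dict, List, Optional

# Child -> parent links. "Contact", "contactName" and "telephone" come from
# the example above; the root name "Order" is an assumption for illustration.
PARENT: Dict[str, Optional[str]] = {
    "Order": None,
    "Contact": "Order",
    "contactName": "Contact",
    "telephone": "Contact",
}


def ancestors(term: str) -> List[str]:
    """Return the ancestors of a term, direct parent first."""
    chain = []
    parent = PARENT[term]
    while parent is not None:
        chain.append(parent)
        parent = PARENT[parent]
    return chain


def level(term: str) -> int:
    """Level within the hierarchy, counting the root as level 1."""
    return len(ancestors(term)) + 1


def siblings(term: str) -> List[str]:
    """Other terms that share the same direct parent."""
    parent = PARENT[term]
    return [t for t, p in PARENT.items() if p == parent and t != term]


def color_legend(term: str) -> Dict[str, str]:
    """Assign green to ancestors, red to the term, blue to siblings."""
    colors = {a: "green" for a in ancestors(term)}
    colors[term] = "red"
    colors.update({s: "blue" for s in siblings(term)})
    return colors
```

With this toy hierarchy, contactName sits at level 3, Contact is colored green, and telephone is colored blue.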
As demonstrated below, the algorithm matching result is 100% and the majority decision of the crowd is 66.67%.
Finally, make your decision by selecting a confidence level and then pressing the SUBMIT button.
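
To give an intuition for a figure such as a 66.67% majority decision, here is a hedged sketch of one way it could arise. How the tool actually aggregates the crowd is not stated here, so this is an assumption: each earlier participant's confidence level is turned into a vote (above 50 is match, below 50 is no match), and the reported percentage is the share of votes for the more popular label. The function name majority_decision and the tie rule are hypothetical.

```python
from typing import List, Tuple


def majority_decision(confidences: List[float]) -> Tuple[str, float]:
    """Return the majority label and the percentage of votes behind it.

    Assumption: votes are derived from confidence levels, and a tie is
    resolved in favor of "match" (an arbitrary choice for this sketch).
    """
    if not confidences:
        raise ValueError("at least one earlier decision is required")
    match_votes = sum(1 for c in confidences if c > 50)
    no_match_votes = len(confidences) - match_votes
    if match_votes >= no_match_votes:
        label, votes = "match", match_votes
    else:
        label, votes = "no match", no_match_votes
    share = round(100 * votes / len(confidences), 2)
    return label, share
```

Under these assumptions, three earlier participants with confidence levels 90, 70, and 20 would yield a "match" majority of 66.67%.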


The task consists of 30 element pairs and takes approximately 30 minutes.
By proceeding to the task, you approve having your personal information and task results stored in the system. This data will be used for academic research only.

Click on the TASK button.

Thank You!

Comments sent successfully.

Please copy the following code and paste it into the submission box in Moodle:

Task Summary

No data is available to show.


Logical Riddles

The following intelligence test contains 3 psychometric riddles. The estimated duration of this part is 2 minutes.
Answer all the riddles carefully and then proceed to the matching task by clicking the CONTINUE button. Enter numerical answers only.

Question Answer
If you flip a fair coin three times, what are the odds that it lands on tails at least once?
A frog fell into a 30 m deep hole. Every day she climbs 3 m but then falls back down 2 m. How many days will it take for the frog to get out of the hole?
A bakery bakes 400 cookies every day. When the manager does not look, 20% of the cookies are thrown away. How many extra cookies does the manager have to bake to make up for it?
Bob stands in a queue. He counts the people ahead of him and finds out that he is in the 38th place. Then he counts the people behind him and finds out he is in the 56th place from last place. How many people are in the queue?
The ants are walking in a line. In what place will the ant that passed the one in the second place be?
Apple mash contains 99% water and 1% apples. I left 100 kg of apple mash in the sun and a portion of the water evaporated. Now the apple mash consists of only 98% water. What is the weight of the apple mash now?
A test for a disease with a 1/1000 frequency gives a false positive 5% of the time. What are the odds for an ill person to get a positive result?

Term A - FirstName
Column Type
Instance Example
A, B
Term A Hierarchy

Term B Hierarchy

Term B - FirstName
Column Type
Instance Example
A, B

Algorithm Similarity Result 60%
Majority Decision 80% Match
Your Decision

Add User

Click CONTINUE to start the task.


No data is available to show.
Please use the filter option to reselect the users and groups whose data should be shown.