The affiliations data set is one of the key parts of the Bitergia Analytics Platform. This section explains how to modify the information about specific contributors or organizations.
The huge variety of tools used by software development communities makes any analysis really difficult. It is not enough to measure the activity in the code, issues, Q&A forums, mailing lists, etc. Without a way to connect the activity of contributors across the different tools (data sources, as we call them), it is not possible to offer high-quality metrics. As an example, the three metrics below are only possible with the affiliations data set:
- Total number of contributors of a community. If your team members are using GitHub and Jira, are you sure you are not counting them twice?
- Impact of an organization/company in the project. To get this metric you need the link between community members and organizations.
- Retention rate of your contributors. How can you tell that someone is new to the community if you do not have a way to identify people using different accounts?
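The double-counting problem in the first metric can be illustrated with a minimal sketch. The records, account names, and the choice of email as the matching key are all assumptions made for illustration; this is not how the platform stores or unifies identities.

```python
# Hypothetical activity records: (data source, account name, email).
activity = [
    ("github", "jsmith", "jane@example.com"),
    ("jira", "jane.smith", "jane@example.com"),
    ("github", "bob", "bob@example.org"),
]

# Naive count: every (source, account) pair is treated as a different person.
naive_count = len({(source, account) for source, account, _ in activity})

# Unified count: accounts sharing an email are treated as one individual.
# Email is only one possible matching criterion; it is used here for brevity.
unified_count = len({email for _, _, email in activity})

print(naive_count, unified_count)
```

Without the link between the GitHub and Jira accounts, the same person is counted twice.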
Identities are collected and stored by the Bitergia Analytics Platform. To improve the quality of the data set, use the affiliations editor (SortingHat). The typical actions our customers perform are described below.
The most common issue is to have some of the organizations under-represented. Follow the steps below to fix this:
- Go to the
- Select the organization Unknown in either of the two pie charts. Organizations are shown in the inner circle.
- Identify the domains that should be associated to an organization.
- Open the identities editor. It will be available at
- Go to the Organizations table on the right side and search for the one you are missing.
- If you do not find it, add that organization to the database.
- If you find it, extend the information available for that organization.
- When the domain is added to the organization a process will refresh the metrics you see on the dashboards. This takes a couple of hours.
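The step of identifying which domains should be associated with an organization can be approached by counting the email domains of contributors that currently appear as Unknown. The sketch below assumes you have exported those email addresses to a list; the addresses and the `domain_of` helper are hypothetical.

```python
from collections import Counter

# Hypothetical email addresses of contributors shown as "Unknown".
unknown_emails = [
    "alice@example.com",
    "bob@example.com",
    "carol@mail.example.org",
    "no-email-here",
]


def domain_of(email):
    """Return the lowercase domain of an email address, or None if malformed."""
    local, sep, domain = email.rpartition("@")
    if not sep or not local or not domain:
        return None
    return domain.lower()


# Count how many unaffiliated contributors use each domain.
domain_counts = Counter(
    d for d in (domain_of(e) for e in unknown_emails) if d is not None
)
for domain, count in domain_counts.most_common():
    print(domain, count)
```

The most frequent domains are usually the best candidates to add to an organization.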
Adding a new organization#
An organization can be added via SortingHat which will be available at
Click the ADD button on the top right corner of the Organizations table to add an organization.
To enable automatic affiliation, add the base domain of the organization. Marking it as a top domain will also auto-affiliate identities that use its subdomains.
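One way to picture the effect of the top domain flag is the sketch below. It is an interpretation of the behavior described above, not SortingHat's actual implementation; the `DOMAINS` table, the organization names, and the `affiliate` function are hypothetical.

```python
# Hypothetical domain table: domain -> (organization, is_top_domain).
DOMAINS = {
    "example.com": ("Example Inc.", True),
    "example.org": ("Example Foundation", False),
}


def affiliate(email):
    """Return the organization for an email, or None if no domain matches."""
    domain = email.rpartition("@")[2].lower()
    # An exact match works for every registered domain.
    if domain in DOMAINS:
        return DOMAINS[domain][0]
    # Subdomains only match domains flagged as top domain.
    parts = domain.split(".")
    for i in range(1, len(parts)):
        parent = ".".join(parts[i:])
        entry = DOMAINS.get(parent)
        if entry and entry[1]:
            return entry[0]
    return None


print(affiliate("jane@eu.example.com"))  # matched through the top domain
print(affiliate("bob@eu.example.org"))   # example.org is not a top domain
```

In this sketch, an address at `eu.example.com` is affiliated because `example.com` is flagged as top domain, while an address at `eu.example.org` stays unaffiliated.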
Removing an organization#
Organizations can be removed via SortingHat. Note that all the enrollments of contributors
affiliated to the deleted organization will be lost.
Go to the Organizations table on the main page, and use the Search box to look for an organization. Click on the button with the three dots on the right side of the organization and then select Delete organization on the dropdown menu.
Adding an e-mail domain to an organization#
The platform can map email domains to organizations. By doing that every identity with any of those mapped domains will be associated to the proper organization.
On the Organizations table, use the Search box to look for an organization.
For each organization you want to update, click the three-dot button and select View/edit domains to add or remove the organization's domains.
Enrolling a person to an organization#
Search for the user using the
Search box on the
Individuals table, and click on the
button with the chevron icon next to the user to expand it. Then, click the
Organizations. On the pop up you can link the profile to an organization and
set the start and end dates.
Alternatively, you can use drag and drop to enroll a person to an organization.
Search for the contributor on the
Individuals table and for the organization on the
Organizations table. Grab the user with your mouse and release it over the organization
or vice versa. After that, a pop-up will let you set the start and end dates.
Un-enrolling a person from an organization#
Expand the contributor's information and click on the button with the trashcan icon next
to the organization you want to remove. To un-enroll the person from all the
organizations to which they are affiliated, click
There is an option to mark an identity as a bot. This is available in the
section in SortingHat. This helps to filter out automated activity in the dashboard while
keeping such information in the database.
Search for the identity that you want to mark as a bot and click on the button with a robot icon next to the name.
Preventing affiliations removed from a contributor from being restored#
If you changed the affiliation of some contributors and you discover that the changes were reverted, you can try the following. Instead of removing an organization that was automatically added to the contributor, change its timeframe. For instance, you can set the enrollment from 1900-01-01 to 1900-01-02, so that no activity data falls within that enrollment.
How dates work#
Dates are by default set to the beginning of the day, i.e.
Thus, we should set the periods as:
Because end date for
Organization A will be set under the hood as
(not inclusive) and start date for
Organization B will be set as
2019-10-01 will be the first day that the profile is enrolled in Organization B, and 2019-09-30 will be the last day that the profile is enrolled in Organization A.
It could be summarized as the following condition, where
date is the date to check
and decide the enrollment:
start_date <= date < end_date.
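The condition can be checked with a short sketch. The enrollment list, the midnight timestamps, the start of Organization A, the end of Organization B, and the `organization_on` helper are assumptions chosen to mirror the example; they are not taken from SortingHat's code.

```python
from datetime import date, datetime

# Hypothetical enrollments; timestamps are set to the beginning of the day.
enrollments = [
    ("Organization A", datetime(2019, 1, 1), datetime(2019, 10, 1)),
    ("Organization B", datetime(2019, 10, 1), datetime(2020, 1, 1)),
]


def organization_on(day):
    """Return the organization whose period contains the given day."""
    moment = datetime(day.year, day.month, day.day)  # beginning of the day
    for name, start_date, end_date in enrollments:
        # The start is inclusive and the end is not.
        if start_date <= moment < end_date:
            return name
    return None


print(organization_on(date(2019, 9, 30)))
print(organization_on(date(2019, 10, 1)))
```

Because the end date is not inclusive, the same date can safely be used as the end of one enrollment and the start of the next without creating an overlap.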
Matching of identities#
Our support team will study the data sources you are tracking and set the best algorithm to identify the different accounts used by your community members. The goal is to avoid duplicated identities while also avoiding unifying identities that belong to different people.
Types of matching#
There are three types of matching that will allow the system to unify identities:
- email: same email address.
- name: same full name.
- username: same username of any source.
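As a rough illustration of how matching criteria lead to unified identities, the sketch below groups identities that share a value for any selected field. It is a simplified model, not the platform's matching algorithm; the identity records, field names, and the `unify` function are hypothetical.

```python
from collections import defaultdict

# Hypothetical identities; field names are illustrative.
identities = [
    {"id": 1, "source": "git", "name": "Jane Smith", "username": None},
    {"id": 2, "source": "github", "name": "Jane Smith", "username": "jsmith"},
    {"id": 3, "source": "jira", "name": "J. Smith", "username": "jsmith"},
    {"id": 4, "source": "git", "name": "Bob Lee", "username": None},
]


def unify(identities, criteria):
    """Group identity ids that share a non-empty value for any criterion."""
    parent = {i["id"]: i["id"] for i in identities}

    def find(x):
        # Follow parent links to the representative of the group.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for field in criteria:
        first_seen = {}
        for identity in identities:
            value = identity.get(field)
            if not value:
                continue
            key = value.lower()
            if key in first_seen:
                # Same value seen before: merge both groups.
                parent[find(identity["id"])] = find(first_seen[key])
            else:
                first_seen[key] = identity["id"]

    groups = defaultdict(set)
    for identity in identities:
        groups[find(identity["id"])].add(identity["id"])
    return sorted(sorted(g) for g in groups.values())


print(unify(identities, ["name"]))
print(unify(identities, ["name", "username"]))
```

Adding criteria merges more accounts, which reduces duplicates but also raises the risk of unifying different people, for example two contributors with the same full name.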
Contributor activity is labeled with a single company when the contributor is enrolled in several companies#
When two enrollments overlap in time, the system will use only one of them to update the enriched information on the dashboard, so only that one will be visible. If you detect identities with overlapping dates, edit them so that the profile is not enrolled in more than one company at the same time.
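Overlapping enrollments can be spotted with a check like the one sketched below, assuming you have the enrollments of a profile as a list of periods with an inclusive start and a non-inclusive end. The data and the `overlapping_pairs` function are hypothetical.

```python
from datetime import date

# Hypothetical enrollments of one profile: (organization, start, end).
enrollments = [
    ("Company A", date(2018, 1, 1), date(2019, 6, 1)),
    ("Company B", date(2019, 1, 1), date(2020, 1, 1)),
    ("Company C", date(2020, 1, 1), date(2021, 1, 1)),
]


def overlapping_pairs(enrollments):
    """Return pairs of organizations whose periods intersect."""
    ordered = sorted(enrollments, key=lambda e: e[1])
    pairs = []
    for i, (org_a, start_a, end_a) in enumerate(ordered):
        for org_b, start_b, end_b in ordered[i + 1:]:
            # Two periods overlap when each starts before the other ends.
            if start_a < end_b and start_b < end_a:
                pairs.append((org_a, org_b))
    return pairs


print(overlapping_pairs(enrollments))
```

In this example only Company A and Company B overlap. Company B and Company C share a boundary date but do not overlap, because the end date is not inclusive.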
Changes are not visible in the metrics dashboard#
Changes made to the identities database need to be synchronized with the metrics data set.
For each data source (e.g., git, mboxes) there are 3 main phases: collection, enrichment and identities refresh. The last one takes care of synchronizing the data stored in OpenSearch with the data available in SortingHat. The time needed to complete the 3 phases depends on the amount of data available in the data source, but it generally takes less than 3 hours.