Data Scientist, EXL Service
About the role
Job Description
Design and implement entity resolution and record linkage pipelines across multiple data sources
Build and evaluate matching algorithms using classical ML, statistical scoring, and fuzzy string-matching techniques
Develop attribute fusion logic to construct canonical golden records from conflicting multi-source data
Analyze data quality issues, document findings, and propose remediation strategies
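As a rough illustration of the attribute fusion work listed above, the sketch below builds a golden record from a cluster of already-matched records. It assumes a simple rule that is not taken from this posting: pick the most frequent non-empty value per attribute and break ties by a source trust ranking. The source names, the `SOURCE_PRIORITY` ranking, and the `fuse_records` helper are all hypothetical.

```python
from collections import Counter

# Hypothetical trust ranking for illustration: lower number = more trusted.
SOURCE_PRIORITY = {"crm": 0, "billing": 1, "vendor_feed": 2}


def fuse_records(records, attributes):
    """Build one golden record from a cluster of matched records.

    Assumed rule: most frequent non-empty value wins; ties go to the
    value seen in the most trusted source.
    """
    golden = {}
    for attr in attributes:
        # Skip missing or blank values for this attribute.
        candidates = [
            (r[attr], r["source"])
            for r in records
            if r.get(attr) not in (None, "")
        ]
        if not candidates:
            golden[attr] = None
            continue

        counts = Counter(value for value, _ in candidates)

        # Best (lowest) trust rank among the sources that supplied each value.
        best_rank = {}
        for value, source in candidates:
            rank = SOURCE_PRIORITY.get(source, len(SOURCE_PRIORITY))
            best_rank[value] = min(rank, best_rank.get(value, rank))

        golden[attr] = min(counts, key=lambda v: (-counts[v], best_rank[v]))
    return golden


cluster = [
    {"source": "crm", "name": "Acme Corp", "city": "Pune", "phone": ""},
    {"source": "billing", "name": "ACME Corporation", "city": "Pune",
     "phone": "020-5550100"},
    {"source": "vendor_feed", "name": "ACME Corporation", "city": None,
     "phone": "020-5550100"},
]
golden = fuse_records(cluster, ["name", "city", "phone"])
print(golden)
```

Real fusion logic would likely also weigh recency and per-attribute source reliability; the vote-plus-priority rule here is only one possible starting point.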
Data Source Evaluation
Assess new external data sources (open and commercial) for coverage, quality, and applicability to Customer Master use cases
Apply existing evaluation criteria and contribute additional quality metrics where relevant
Produce structured evaluation reports with recommendations for adoption or rejection
Analytics & Reporting
Profile source datasets and track match quality metrics (precision, recall, F1, coverage)
Build dashboards and analytical summaries to communicate pipeline performance to stakeholders
Document data lineage, matching logic, and provenance for audit and reproducibility
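For reference, the match quality metrics named above (precision, recall, F1) can be computed over record pairs roughly as follows. This is a minimal sketch assuming a labeled set of true matching pairs is available; the `pair_metrics` helper and the sample record IDs are hypothetical.

```python
def pair_metrics(predicted_pairs, true_pairs):
    """Return (precision, recall, f1) for pairwise match decisions.

    Pairs are treated as unordered, so ("a1", "b1") equals ("b1", "a1").
    """
    predicted = {frozenset(p) for p in predicted_pairs}
    actual = {frozenset(p) for p in true_pairs}

    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


predicted = [("a1", "b1"), ("a2", "b2"), ("a3", "b9")]
actual = [("b1", "a1"), ("a2", "b2"), ("a4", "b4"), ("a5", "b5")]

precision, recall, f1 = pair_metrics(predicted, actual)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Pairwise metrics are only one way to score linkage output; cluster-level measures may be more appropriate when records form groups larger than two.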
Qualifications
Python - Pandas, NumPy, scikit-learn, rapidfuzz / jellyfish
SQL - Complex queries, window functions, aggregations; Hadoop/Hive or Presto/Trino
Classical ML & Statistics - Supervised/unsupervised models, probabilistic scoring, clustering, feature engineering
String matching & NLP - Fuzzy matching (Jaro-Winkler, Levenshtein, TF-IDF), text normalization, tokenization
Entity Resolution - Record linkage concepts: blocking, scoring, deduplication, cluster evaluation
Data Quality Assessment - Completeness, consistency, coverage metrics; source profiling
Data Analysis - Exploratory analysis, hypothesis testing, statistical reasoning
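To give a sense of how several of the listed skills fit together (text normalization, blocking, fuzzy scoring), here is a small standard-library sketch. It uses a hand-written Levenshtein distance rather than rapidfuzz or jellyfish so that it runs without extra packages. The blocking key (first character of the normalized name), the 0.8 threshold, and all function names are assumptions chosen for illustration.

```python
import re
from collections import defaultdict
from itertools import combinations


def normalize(text):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9 ]", " ", text.lower()).split())


def levenshtein(a, b):
    """Edit distance via the standard two-row dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Edit distance rescaled to [0, 1], where 1.0 means identical."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest


def candidate_pairs(records, threshold=0.8):
    """Block on the first character, then score pairs within each block."""
    blocks = defaultdict(list)
    for rec_id, name in records.items():
        norm = normalize(name)
        blocks[norm[:1]].append((rec_id, norm))

    matches = []
    for block in blocks.values():
        for (id_a, a), (id_b, b) in combinations(block, 2):
            score = similarity(a, b)
            if score >= threshold:
                matches.append((id_a, id_b, round(score, 2)))
    return matches


records = {1: "Jon Smith", 2: "John Smith", 3: "Jane Smythe", 4: "Acme Ltd."}
matches = candidate_pairs(records)
print(matches)
```

Blocking keeps the number of compared pairs manageable, at the cost of missing matches whose keys differ; a first-character key is deliberately crude and would miss, for example, a name with a typo in its first letter.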