Data Scientist

EXL Service

Bengaluru, IN | Onsite | Posted Jun 1, 2026

Skills

scikit-learn, clustering, hypothesis, hadoop, presto, pandas, python, trino, numpy, natural language processing, terraform, ml

About the role

Job Description

Design and implement entity resolution and record linkage pipelines across multiple data sources

Build and evaluate matching algorithms using classical ML, statistical scoring, and fuzzy string-matching techniques

Develop attribute fusion logic to construct canonical golden records from conflicting multi-source data

Analyze data quality issues, document findings, and propose remediation strategies

Data Source Evaluation

Assess new external data sources (open and commercial) for coverage, quality, and applicability to Customer Master use cases

Apply existing evaluation criteria and contribute additional quality metrics where relevant

Produce structured evaluation reports with recommendations for adoption or rejection

Analytics & Reporting

Profile source datasets and track match quality metrics (precision, recall, F1, coverage)

Build dashboards and analytical summaries to communicate pipeline performance to stakeholders

Document data lineage, matching logic, and provenance for audit and reproducibility

Qualifications

Python - Pandas, NumPy, scikit-learn, rapidfuzz / jellyfish

SQL - Complex queries, window functions, aggregations; Hadoop/Hive or Presto/Trino

Classical ML & Statistics - Supervised/unsupervised models, probabilistic scoring, clustering, feature engineering

String matching & NLP - Fuzzy matching (Jaro-Winkler, Levenshtein, TF-IDF), text normalization, tokenization

Entity Resolution - Record linkage concepts: blocking, scoring, deduplication, cluster evaluation

Data Quality Assessment - Completeness, consistency, coverage metrics; source profiling

Data Analysis - Exploratory analysis, hypothesis testing, statistical reasoning
