ML Engineer II (Inference Platform)

Beatdapp

Vancouver, CA | Onsite | Posted May 29, 2026

At a glance

Highlights

  • Advanced streaming integrity technology
  • ML inference at scale
  • Fast-moving roadmaps
  • Engineering judgment emphasis

Why this role might suit you

The role offers exposure to cutting-edge audio ML inference, multi-cloud GPU orchestration, and fast-moving product roadmaps, providing engineers with challenging architectural problems and opportunities to influence scalable platform design.

Skills

python, go, rust, c++, docker, dockerfile, terraform, kubernetes, aws, gcp, mlflow, sql, bigquery, postgres, airflow, observability, gpu-optimization, micro-batching, concurrency, request-queueing, image-signing, ci-cd

About Beatdapp

Beatdapp is a company delivering the most advanced streaming integrity technology in the world. One of our ventures is building machine learning inference systems for audio at scale. The work spans music, podcasts, and speech, with a particular focus on AI-generated sound. As generative audio gets cheaper and faster, the platforms we serve need accurate signal about what's real in their content, and they depend on us to provide it.

About the role

This role sits at the intersection of ML engineering, platform / infrastructure work, and inference systems. You will bridge the gap between raw audio and the clean signals our detection models depend on. You will partner with data scientists on bringing those models into production, carrying the production lens (latency, cost, customer-facing edges) into design conversations early so trade-offs are made together rather than discovered late.

In practice, the work cuts across the GPU-bound inference containers, the multi-cloud infrastructure that runs them, the API layer in front, the data and observability around them, and the CI that ships it all. The architectural challenge running through all of it is containing drift and scaling with minimal code.

Roadmaps here are weeks, not quarters, and your scope grows and shifts as the team and systems do. So we are hiring for engineering judgment first: a strong feel for clean, scalable design, an eye for code hygiene, and the courage to advocate for a better approach rather than default to consensus.

What you'll do

Container Engineering and Orchestration: Build, tune, and ship our inference containers. This covers the Dockerfile and its dependencies, image size and cold starts, GPU access patterns, the multi-cloud orchestration that runs the containers (ECS, Cloud Run, GKE, EKS), test coverage for the container surface, and the storage abstraction the containers depend on.

In-Container Performance and Resource Optimization: Squeeze more out of each GPU instance: concurrency tuning, VRAM accounting, request timeouts and queueing, rate limiting, multi-GPU distribution on instances that have more than one, and the right-sizing decisions that follow.

Scale and Stress Testing: Build and run scale and stress scenarios across mock deployments that mirror real customer environments. Characterize the latency-vs-throughput curves, find the breaking points, and turn the results into autoscaling and instance-sizing decisions.

Cloud Infrastructure: Operate the Terraform stack across multiple clouds (GCP, AWS), including networking, identity, GPU nodes, autoscaling, and per-tenant account configurations.

API Layer: Build and extend the customer-facing API layer that fronts the inference service: client authentication, rate limiting, per-client data isolation, and request metering.

Data Pipelines: Maintain and extend the data orchestration pipelines that feed model evaluation, customer reporting, and operational dashboards.

Observability: Build and tune the metrics, dashboards, logging, and alarms across three layers: the inference service, the running instances, and the deployed models themselves.

What we're looking for

Related STEM degree (BSc, MSc, or higher) and 3+ years of work experience in platform / infra / backend / ML / applied-ML / data engineering.

Strong engineering skills: The ability to write clean, scalable, production-grade code in Python or in a more performance-oriented language (Go, Rust, C++).

Architectural fluency across data stores, distributed systems, caching, and data transfer protocols.

Data engineering skills: Comfort building data processing pipelines and using SQL (Airflow, BigQuery, Postgres).

Deep cloud infrastructure and networking experience across one or more platforms (GCP, AWS).

ML platform tooling: comfort with MLflow or similar tooling and model lifecycle processes (model versioning, artifact storage, promotion workflows).

Terraform: write and modify modules, understand state and backends, IaC over console.

CI/CD discipline: cloud OIDC, image signing, pinned versions, an instinct for cheap and reproducible CI.

Observability instincts: comfortable instrumenting across hardware, application, and model layers (latency, throughput, score distributions, drift). You know which metric to look at first when latency spikes.

Inference performance tuning: comfort with the levers of a high-throughput GPU service (micro-batching, concurrency, request queueing, in-container resource management).

Strong written communication: runbooks, design docs, PR descriptions, postmortems, and ticket hygiene (Jira).

Nice-to-haves

Not required, but a strong plus if you bring hands-on work experience with at least one of the following:

Audio or media systems

Signal processing

Speech detection (synthetic / artificial)

Computer vision

GPU work beyond running inference (CUDA, kernels, drivers, cluster operations)

Streaming systems (Kafka, Pub/Sub, Kinesis, or similar)
