Liberté, égalité, ouverture. I’m doing my part for the cause by hosting OpenXData on April 29 - a free, virtual event with 30+ talks on open data infra, from table formats to query engines to feature serving. Join me and get sharper than a guillotine at cutting latency. |
HOT TAKE
Ctrl+Trust
"I need to understand the code" is the new bottleneck. What matters more: shipping or knowing? |
LAST WEEK'S TAKE
A Model Interface
“We upgraded the model and nothing changed” has played out enough times that interface wins comfortably. |
PRESENTED BY CLERIC AI
Heading to Google Cloud Next in Vegas next week? Some of the best conversations happen between sessions. Cleric AI is bringing those conversations to the table instead - a small dinner for engineering leaders at José Andrés’ renowned restaurant, Zaytinya, on Tuesday, April 21. A chance to compare notes on how AI is changing the way teams build and ship software, with people running systems in production. Small group, limited seats.
Request an invite |
HIDDEN GEMS
Curated finds to help you stay ahead
- Autonomously completing 66 engineering tickets via agents, using structured workflows, task decomposition, and iterative feedback for real-world software development.
- A standardized method for scaling autonomous agents on the OpenClaw architecture, separating out high-level logic to ensure scalable, reproducible task execution.
- LLM workflows orchestrated with tmux panes, enabling parallel tool use, inspection, and control with transparent, reproducible, debuggable execution.
- Detecting humans vs machines in call audio within milliseconds to route calls efficiently, balancing latency, accuracy, and scale at high volume. |
💡 Job of the week
Senior Machine Learning Platform Engineer // Zip Co Limited (US Remote)
Senior ML platform role focused on building and scaling production systems across Databricks and Spark. Covers full lifecycle from feature pipelines to model serving, with emphasis on reliability, observability, and distributed data performance in production environments.
Responsibilities
- Build batch and streaming feature pipelines using PySpark and Spark SQL
- Design and operate offline and online feature store patterns
- Define MLflow registry standards and model promotion workflows
- Deploy, monitor, and scale model serving endpoints on Databricks
Requirements
- Strong PySpark and Spark SQL experience with distributed data systems
- Hands-on MLflow, feature stores, and production model serving experience
- Experience implementing CI/CD pipelines for ML workflows in production
- Experience with Databricks, Delta Lake, and Azure-based data platforms
|
Find more roles on our new jobs board - and if you want to post a role, get in touch. |
MLOPS COMMUNITY
The Modern Software Engineer
AI coding agents can finish a task before you’ve finished framing it, but that speed hides a harder problem: how much of the work can you trust, verify, or even understand? This discussion looks past the demo magic and into the practical bottlenecks teams are hitting as agents move from autocomplete to semi-autonomous collaborators.
- Validation is the real constraint. Agents can generate code fast, but tests, checks, and review harnesses still decide what is safe to ship.
- Team structure is starting to shift. Product, engineering, and design roles are bleeding into each other as more people can inspect code, propose changes, and unblock themselves.
- The skill gap is changing shape. Clear articulation, planning, and delegation matter more when engineers are effectively managing agents instead of writing every step by hand.
The hard part is no longer getting code written but knowing what to trust, what to verify, and where humans still need to hold the line. Video || Spotify || Apple
|
How We Cut LLM Latency 70% With TensorRT in Production
Cut latency 70% or burn cash on idle GPUs - running LLMs in production is a constant trade-off. This breakdown shows what it takes to move from demos to real systems, where cost, throughput, and architecture decisions matter more than model choice.
- Cost isn’t fixed - it’s shaped by architecture. Bigger GPUs can be cheaper overall if higher throughput reduces total runtime.
- Cold starts and scaling are the hidden bottlenecks. Preloading models, faster storage, and scheduled or dynamic scaling cut minutes off spin-up times.
- Optimization compounds. Techniques like TensorRT, batching, and KV cache usage unlock major gains without changing models.
The real advantage comes from tuning the system around your workload, not chasing the next model release. Video || Spotify || Apple
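To make the "bigger GPUs can be cheaper" point concrete, here is a small back-of-the-envelope sketch. The throughput and hourly prices are made-up illustrative numbers, not figures from the talk, and `job_cost` is a hypothetical helper that assumes a fixed batch workload billed by runtime.

```python
def job_cost(total_tokens: int, tokens_per_second: float, hourly_rate_usd: float) -> float:
    """Cost in USD of pushing a fixed workload through one GPU, billed by runtime."""
    runtime_hours = total_tokens / tokens_per_second / 3600
    return runtime_hours * hourly_rate_usd


# Hypothetical workload and GPU profiles (assumptions for illustration only).
workload_tokens = 36_000_000

# Smaller GPU: cheaper per hour, but slower.
small_gpu_cost = job_cost(workload_tokens, tokens_per_second=1_000, hourly_rate_usd=1.0)

# Bigger GPU: 3x the hourly price, but 5x the throughput.
large_gpu_cost = job_cost(workload_tokens, tokens_per_second=5_000, hourly_rate_usd=3.0)

print(f"small GPU: ${small_gpu_cost:.2f}, large GPU: ${large_gpu_cost:.2f}")
```

With these assumed numbers the pricier GPU finishes in a fifth of the time, so the total bill comes out lower. The comparison flips as soon as the throughput gain is smaller than the price gap, which is why it is worth measuring on your own workload.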
|
Context Graphs And Their Implementation: The Missing Layer Between Human Judgment and Machine Agency
If context graphs are meant to become the memory layer for agents and organizations, the hard part is not drawing nodes and edges. It is capturing why a decision happened, who approved it, what constraints shaped it, and whether it later proved right. This piece argues that context graphs only become useful when they can survive real company messiness like review workflows, legal sensitivity, local jargon, and scale.
- Decision traces need governance. If humans do not review, correct, and approve them, the graph risks becoming a polished record of bad reasoning.
- Reasons need dual encoding. Short natural-language explanations plus structured tags give humans something readable and agents something stable to reason over.
- The data layer has to handle reality. Time-aware context, multimodal artifacts, integrations, retention rules, and fast writes are all part of making this work outside a demo.
The real blocker is not the graph itself but whether an organization can turn judgment into something structured, reviewable, and worth trusting later. Read the blog
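As a rough illustration of "dual encoding" plus human review, a decision trace could be modeled along these lines. This is a sketch under our own assumptions, not the schema from the blog: `DecisionTrace`, its field names, and `reviewed_traces` are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import FrozenSet, List, Optional, Tuple


@dataclass(frozen=True)
class DecisionTrace:
    decision: str
    reason_text: str                     # short natural-language reason, for humans
    reason_tags: FrozenSet[str]          # structured tags, stable for agents to reason over
    constraints: Tuple[str, ...]         # what shaped the decision
    decided_at: datetime                 # time-aware context
    approved_by: Optional[str] = None    # stays None until a human reviews the trace
    proved_right: Optional[bool] = None  # stays None until the outcome is known


def reviewed_traces(traces: List[DecisionTrace], tag: str) -> List[DecisionTrace]:
    """Return only human-approved traces carrying the given tag."""
    return [t for t in traces if t.approved_by is not None and tag in t.reason_tags]


# Hypothetical example data.
traces = [
    DecisionTrace(
        decision="Delay launch by one week",
        reason_text="Legal review of the new retention terms was still open.",
        reason_tags=frozenset({"legal", "launch"}),
        constraints=("retention-policy",),
        decided_at=datetime(2025, 1, 15, tzinfo=timezone.utc),
        approved_by="reviewer@example.com",
    ),
    DecisionTrace(
        decision="Skip legal sign-off for the hotfix",
        reason_text="Change looked too small to need review.",
        reason_tags=frozenset({"legal"}),
        constraints=(),
        decided_at=datetime(2025, 1, 20, tzinfo=timezone.utc),
    ),
]

approved_legal = reviewed_traces(traces, "legal")
```

The unreviewed second trace never reaches the agent, which is one simple way to keep a "polished record of bad reasoning" out of downstream context.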
|
MEME OF THE WEEK  |
ML CONFESSIONS
The Twilight Time Zone
I spent three weeks building what I was convinced was a breakthrough feature for our recommendation model. Pulled in user session duration, did some clever windowing, engineered a rolling average that captured engagement patterns nobody else had tried. Offline metrics jumped. I wrote up the results, put together slides, and booked time with the team lead to talk about promoting it to the next A/B test.
She looked at it for about ten minutes. Asked me what timezone the session timestamps were in. I said UTC. She pulled up the ingestion pipeline docs and showed me they were in the user's local timezone, mixed across regions, with no normalization. My rolling averages were blending Tuesday morning in Tokyo with Monday evening in Chicago. The "signal" was just timezone noise creating artificial variance that happened to correlate with the label in the test set.
She was nice about it. That almost made it worse. I still think about it every time I touch a timestamp column.
Share your confession here. |
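For anyone who wants to avoid the same confession, here is a minimal sketch of normalizing mixed local timestamps to UTC before any windowing. It assumes each row carries a naive local timestamp plus a known UTC offset; `to_utc` and the sample values are hypothetical, and a real pipeline would more likely store IANA zone names and use the standard library's `zoneinfo` so daylight saving changes are handled.

```python
from datetime import datetime, timedelta, timezone


def to_utc(local_naive: datetime, utc_offset_hours: float) -> datetime:
    """Interpret a naive local timestamp at the given UTC offset and convert it to UTC."""
    if local_naive.tzinfo is not None:
        raise ValueError("expected a naive (timezone-unaware) timestamp")
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    return local_naive.replace(tzinfo=local_tz).astimezone(timezone.utc)


# Tuesday 09:00 in Tokyo (UTC+9) and Monday 18:00 in Chicago (UTC-6, standard time).
tokyo_session = to_utc(datetime(2024, 3, 5, 9, 0), utc_offset_hours=9)
chicago_session = to_utc(datetime(2024, 3, 4, 18, 0), utc_offset_hours=-6)

# After normalization both sessions land on the same instant.
same_instant = tokyo_session == chicago_session
```

Treated as raw local values, these two sessions look 15 hours and a calendar day apart. Once normalized, they are the same moment, which is exactly the kind of gap a rolling average will happily mistake for signal.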
HOW WE CAN HELP
Making the hard stuff simpler
Working on something tricky or planning ahead? Here’s how we can help - just hit reply:
- Custom workshops tailored to your company’s needs
- Hiring? I know some quality folks looking for a new adventure
- Want to connect with someone tackling similar problems? I can introduce you
Thanks for reading, catch you next time! |
Interested in partnering with us? Get in touch. See you in Slack, on YouTube, and in podcast land. Oh yeah, and we are also on X and LinkedIn. |