One thing that's been bothering me lately: benchmark performance often tells me almost nothing about whether a workflow will survive production usage. [D]
Novel Problems in VLA [R]
Live Human Detector on Outbound Phone Calls [R]
Do VLMs in production still use fixed-patch ViTs for their vision capabilities? [D]
Can liveness detection models generalise to synthetic media generation techniques they were never trained on? [D]