Researchers using a benchmark called the Remote Labor Index (RLI) tested several AI models on real remote freelance projects and found that the systems completed only a tiny fraction of the work at an acceptable quality level. The best model reached an automation rate of just 2.5%. The tasks had