
Radar Trends to Watch: March 2024 – O’Reilly

January was a dull month, at least in my opinion. Maybe everyone was recovering from their holidays. February was a short month, but it was far from dull. And I’m not even counting the first shipments of Apple Vision Pro. OpenAI has demoed an impressive text-to-video model called Sora; Google has two very impressive small language models and a model specialized for time series, and it has opened Gemini to the public. Outside of AI, there’s a JVM for WebAssembly; you can use it to run applications like Minecraft in the browser. There are some new ultralight web frameworks. And one of the world’s biggest ransomware groups has been shut down.

On the purely weird front: there are a couple of new esoteric languages, one of which solves the problem of naming. We found out that Origami is Turing complete, so start folding. If you want relief from AI that’s trying to be your pal, try antagonistic AI. And the best of the lot: edible robots.

Artificial Intelligence

  • Mistral has released Mistral Large, their flagship language model, with performance almost equal to GPT-4. It is available only via their API (although a chatbot is in beta). Unlike Mistral’s other models, Mistral Large is not open source.
  • This is different: Google’s DeepMind has announced Genie, a generative model for building interactive worlds. It’s a video model, but unlike other video models, it’s built for game playing. (Think Mario Brothers, not Star Wars.) There are hints at other applications such as using Genie to develop virtual worlds for training other kinds of AI.
  • Now that large language models have been given the ability to execute other programs, they can be prompted to attack websites and other online systems.
  • ZLUDA, a library for running NVIDIA’s proprietary CUDA language on AMD GPUs, was released as an open source project after AMD stopped funding it. (An earlier version targeted Intel GPUs, but that version is no longer supported.)
  • Researchers in China are exploring whether neural networks can develop their own language for images without the intermediary of human language.
  • The competitive programming site Topcoder has issued a challenge: develop an AI bot that helps people fill out government forms.
  • Google has released two small language models, Gemma 2B and Gemma 7B. They claim performance superior to Llama 2 and Mistral. The models are “open,” though not open source. Google has released the weights and, in addition, a responsible generative AI toolkit.
  • Groq is a chatbot with roughly the performance of GPT-3.5 that has been tuned to give replies that are close to instantaneous.
  • Building an interactive restaurant menu with AI: whether or not it’s actually useful, this is a great tutorial about building a RAG application with open source AI.
  • Sora is an impressive new text-to-video model from OpenAI. It is not yet open to the public. OpenAI plans to include C2PA watermarking to identify generated video. They are currently engaged in adversarial testing to make the model less likely to generate biased or harmful content.
  • A research paper explores antagonistic AI: AI that is designed to be challenging, disagreeable, and confronting. Are there applications for AI that aren’t always earnestly trying to be your friend?
  • The US Patent and Trademark Office has ruled that only humans can patent inventions, not AI. This guidance is consistent with the Copyright Office’s approach. It doesn’t mean that AI output is not patentable but that there must be significant human input directing the AI.
  • Google has built a new foundation model for time series. Like language models, and unlike most time series models, TimesFM is pretrained using time series data. It excels at zero-shot predictions.
  • OpenAI is experimenting with long-term memory in ChatGPT (i.e., memory between conversations). Long-term memory raises a number of privacy issues, in addition to more practical questions like getting a fresh start on a conversation that’s gone wrong.
  • AI can be an accessory in the death of traditional languages, or a tool for preserving them.
  • There are many opportunities for using AI to improve accessibility. To use AI effectively, we need to acknowledge the harm that it can do and approach accessibility issues thoughtfully.
  • Artificial Intelligence cannot be used to deny healthcare. For now, at least.
  • Google has upgraded Bard to its latest Gemini model (Gemini Advanced). It’s worth trying; it’s on a par with GPT-4V.
  • Hugging Face has added four new leaderboards for measuring language models’ accuracy in answering questions relevant to businesses (finance, law, etc.), safety and security, freedom from hallucinations, and ability to solve reasoning problems. Unfortunately, the leaderboards only evaluate open source models.
  • Language models can be trained to be deceptive—specifically, to generate code that includes security vulnerabilities given certain prompts. This behavior can be made persistent and is hard to detect and hard to remove.
  • Meta has announced that it will label images that have been generated with AI. They discuss a number of methods for identifying AI-generated images, including watermarking, disclosure by the creator, fact-checking, and automated classification of unmarked images.
  • While AI’s ability to generate music is limited, AI does an extremely good job of mastering human recordings.
  • TinyLlama is yet another new language model. TinyLlama is small: it has only 1B parameters and requires just 550 MB of memory to run. It was designed for small mobile and embedded devices.
  • The Allen Institute has released OLMo, an open source language model. There are 7B and 1B parameter versions, and it claims performance better than similarly sized models. OLMo is the first completely open model: every step in development and every artifact generated is available.
  • We have seen surprisingly little discussion of techniques for mitigating AI risks. These ideas for protecting language models from prompt injection and other attacks are far from exhaustive, but they’re a start.
  • Jeremy Howard has a video on getting started with CUDA programming (NVIDIA GPU programming). It is aimed at Python programmers but no doubt useful for almost anyone.
  • Eagle 7B is another new large language model. It claims to outperform all 7B-class models while requiring the least computational power for inference. It is available on Hugging Face. While Eagle appears to be transformer-based, it claims to point the way “beyond transformers.”
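The RAG tutorial mentioned above boils down to two steps: retrieve the documents most relevant to a question, then stuff them into the prompt as context. Here is a minimal sketch of that skeleton, using a toy bag-of-words similarity in place of a real embedding model; the menu text and function names are invented for illustration.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real system would call an embedding model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    # Rank documents by similarity to the query and keep the top k.
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    # Assemble the prompt that would be sent to the language model.
    context = "\n".join(retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Invented example data: a two-item restaurant menu.
menu = [
    "The margherita pizza contains tomato, mozzarella, and basil.",
    "The house salad contains lettuce, walnuts, and blue cheese.",
]
```

A production system would swap `embed` for a real embedding model, keep the vectors in a vector database, and send the assembled prompt to an LLM; the retrieval-then-prompt structure stays the same.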

Programming

  • Strada is a new IDE for building applications that use services from different SaaS (software as a service) providers. It makes it easier to work with multiple SaaS APIs simultaneously.
  • Something new for esoteric language fans: the namingless language. Naming is hard, so this language has only one data structure (so it doesn’t need a name) and only one operator (so it doesn’t need a name, either).
  • Google is supporting a Rust Foundation effort to improve interoperability between C++ and Rust with the goal of enabling organizations to improve the security of legacy C++ software by migrating to Rust.
  • Xonsh (however that may be pronounced) is a shell for Unix-like systems that combines Unix shell features with full support for Python.
  • Is it a coincidence? Two simple web frameworks for Java and Kotlin appear at almost the same time: Spark and Javalin.
  • Memray is a memory profiler for Python. It can track memory use in libraries written in C or C++, such as NumPy. It’s a great tool for discovering memory leaks, excessive memory allocation, and other problems.
  • Origami is Turing complete. Fold your way to solutions. Maybe we don’t need quantum computers after all.
  • sudo on Windows? The times are indeed changing. (Note that Windows sudo and Linux/WSL sudo are not the same.)
  • Here are some detailed guidelines for designing command line user interfaces for those of us who still believe that command lines are important. They’re the only way to deal effectively with data in bulk.
  • CheerpJ 3.0 is a Java Virtual Machine for WebAssembly. It is capable of running large Java applications (such as Minecraft) in a browser without plugins. It currently supports Java 8, but the long-term plan is to support the current long-term support release (presently Java 21).
  • Scriptisto is a clever tool that lets you write throwaway scripts in (almost) any commonly used compiled programming language. Add a simple shebang line (#!/usr/bin/env scriptisto) to any program, and it automates compilation and runs the program.
  • There’s yet another new language, but this one is different. Pkl is an object-oriented language for configuration, not for general-purpose programming.
  • Scalene is a new profiling tool for Python that accounts for the difference in performance between highly optimized libraries and regular Python code. It can also ask ChatGPT for performance suggestions.
  • GitLab is planning to use ActivityPub (the protocol behind Mastodon and the fediverse) to connect all their Git repositories into a single network. They will start with social features, but their goal is to enable one instance to open requests for a project hosted on another instance.
  • Docker Build Cloud is a service that speeds up the process of building Docker images. Claims of a 39x speedup are impressive, but even if Build Cloud doesn’t deliver quite that much, the decrease in build time is still significant.
  • A study of programming trends associates the use of coding assistants like GitHub Copilot with lower-quality code, increased code churn, more copy/paste code, and less refactoring.
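To make the command-line guidelines above concrete, here is a sketch of a hypothetical `wordfreq` tool built with Python’s argparse. It follows a few recommendations that most CLI design guides share: a clear description, defaults documented in the help text, and reading stdin when no file is given so the tool works in a pipeline. The tool and its flags are invented for illustration, not taken from the guidelines themselves.

```python
import argparse
import sys
from collections import Counter

def build_parser() -> argparse.ArgumentParser:
    # Clear help text and documented defaults are staples of good CLI design.
    parser = argparse.ArgumentParser(
        prog="wordfreq",
        description="Count word frequencies in a text file.",
    )
    parser.add_argument("file", nargs="?", default="-",
                        help="input file, or '-' for stdin (default: stdin)")
    parser.add_argument("-n", "--top", type=int, default=10,
                        help="number of words to show (default: 10)")
    return parser

def main(argv: list[str]) -> int:
    args = build_parser().parse_args(argv)
    # Reading stdin when no file is given lets the tool sit in a pipeline.
    text = sys.stdin.read() if args.file == "-" else open(args.file).read()
    for word, count in Counter(text.split()).most_common(args.top):
        print(f"{count}\t{word}")
    return 0
```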

Web

  • Is it possible to build software with a sense of place? Digital Terroir is a fascinating discussion about what a “sense of place” might mean for digital creations.
  • htmx is a lightweight JavaScript library that allows web development without writing JavaScript directly; rather than write JavaScript, developers add attributes to standard HTML elements. Here is a good comparison of htmx and React.
  • htmz is a minimalist HTML framework that allows you to dynamically load resources within any portion of an HTML page.
  • The state of JavaScript bloat in 2024: it’s not pretty.
  • The Observable Framework is a new static site generator for data-driven interactive web applications. It goes a step beyond notebooks, giving developers all the flexibility of modern web applications. Observable is open source.
  • Microsoft’s Edge browser appears to import data from the Chrome browser (tabs, stored passwords, and more) without the user’s permission, and even if the importBrowsingData setting is explicitly turned off in the user’s profile.
  • Arc Max is a browser that incorporates AI for summarization, asking questions of web pages, and other features. Scott Hanselman questions whether this is a good approach.

Security

  • A new attack against SSH uses the SSH-Snake mapping tool to find private keys. After discovering private keys, it can easily move from one account (and machine) to another.
  • Law enforcement teams from several countries have arrested key members of the LockBit ransomware group, seized control of its infrastructure and data, and created a free decryption tool for victims. As of February 26, though, LockBit appears to be back.
  • The European Court of Human Rights has ruled that laws weakening end-to-end encryption or requiring back doors for law enforcement are illegal.
  • WiFi jamming tools have been used to disable security systems in a string of robberies.
  • A group of vulnerabilities has been discovered that allows an attacker to escape from a container, at which point they can access the host operating system directly.
  • Basic security hygiene is important. An employee accidentally published Mercedes-Benz’s GitHub private key in a public GitHub repository, giving anyone unlimited access to Mercedes’ source archives.
  • Rowhammer is an attack against a system’s memory: repeated reads and writes cause the memory to change values. A new version of the Unix/Linux sudo command resists rowhammer attacks. It is interesting because it is a software mitigation, not a hardware fix.

Virtual Reality

  • C-Infinity is, essentially, a standing chair with built-in controllers that is designed to prevent VR-induced nausea.
  • Brilliant Labs is taking preorders for AI glasses. While there’s little description on the site, the glasses look like a heads-up augmented reality display that superimposes descriptive text on your field of view. They claim compatibility with prescription lenses.
  • Apple’s Vision Pro is now available. There are many product reviews, but Ben Thompson’s review is comprehensive. He identifies the big problem: apps. Not just VR apps, but AR apps, and developing that new generation of apps may require investments that few companies can afford.

Biology

  • Several years ago, a Kickstarter project to create a glow-in-the-dark rose failed. Now you can order a glowing petunia online, along with purple tomatoes. Has synthetic biology arrived?
  • Robots you can eat: Researchers are designing robots, including electronics and actuators, that are entirely edible: honey can possibly act as a semiconductor, gold leaf can be used as wire, and batteries can be made from food materials. There may be applications in medicine.

Energy

  • An abandoned Finnish copper mine will be repurposed as a giant gravity battery that can store excess energy from renewable sources. It isn’t clear how long the battery can run before “discharging” or what the total energy storage is.
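The storage capacity of a gravity battery is just gravitational potential energy, E = mgh, scaled by a round-trip efficiency. Since the article gives no figures for the Finnish project, the numbers below are purely illustrative assumptions, useful only to show how the estimate would be made.

```python
# Back-of-envelope gravity-battery math: recoverable energy is potential
# energy, E = m * g * h, times a round-trip efficiency. All figures here
# are illustrative assumptions, not data about the Finnish mine project.
G = 9.81  # gravitational acceleration, m/s^2

def storage_kwh(mass_kg: float, drop_m: float, efficiency: float = 0.8) -> float:
    """Recoverable energy in kWh from lowering mass_kg by drop_m meters."""
    joules = mass_kg * G * drop_m * efficiency
    return joules / 3.6e6  # 1 kWh = 3.6e6 J

# A 1,000-tonne weight lowered 500 m at 80% round-trip efficiency:
energy = storage_kwh(1_000_000, 500)  # about 1,090 kWh
hours = energy / 500                  # runtime at a steady 500 kW discharge
```

Even with these generous assumptions the result is a modest amount of storage, which is why gravity-battery proposals tend to involve very large masses, deep shafts, or both.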
