2024.03.25 - ScreenAI: A visual language model for UI and visually-situated language understanding
Ping Xia
Web’s 35th Birthday & AVIF & Babu Jingang (Eight Vajra Exercises) & Re‑Raising Yourself & Health Starts with Children & The Present Moment & Self‑Enjoyment
This Week’s Hot Topics
ScreenAI: A visual language model for UI and visually‑situated language understanding
https://blog.research.google/2024/03/screenai-visual-language-model-for-ui.html
Screen user interfaces (UIs) and infographics (such as charts, diagrams, and tables) play important roles in human communication and human‑machine interaction because they enable rich, interactive experiences. UIs and infographics share design principles and visual language (e.g., icons and layouts), offering an opportunity to build a single model that can understand, reason about, and interact with these interfaces. However, their complexity and varied presentation formats make them a unique modeling challenge.
Related:
- AI Phones Are Here, Apps Will Die Out, and the Frontend Development Paradigm Has Changed!
- Generative UI and Outcome‑Oriented Design
- Bringing AI to the Masses with Adam D'Angelo of Quora
- Next.js AI Chatbot 2.0
- How Roblox Helps Developers Create, Scale, and Monetize
- Is the “AI developer” a threat to jobs – or a marketing stunt?
- Introducing Stable Video 3D: Quality Novel View Synthesis and 3D Generation from Single Images
- Logarithm: A logging engine for AI training workflows and services
Marking the Web’s 35th Birthday: An Open Letter
https://webfoundation.org/2024/03/marking-the-webs-35th-birthday-an-open-letter/
The first decade of the web fulfilled its promise: the web was decentralized, with a long tail of content and options; it created small, more localized communities; it provided individual empowerment; and it fostered huge value. In the past decade, however, rather than embodying these values, the web has played a part in eroding them. The consequences are increasingly far‑reaching.
Navigating the Future of Frontend
https://frontendmastery.com/posts/navigating-the-future-of-frontend/
Make sense of modern frontend meta‑frameworks. Connect the dots between fundamental concepts old and new.
Why the Creator of Node.js® Created a New JavaScript Runtime
https://stackoverflow.blog/2024/03/19/why-the-creator-of-node-js-r-created-a-new-javascript-runtime/
Ryan Dahl, creator of Node.js and Deno, tells us about his journey into software development and the creation of Node.js. He explains why he started Deno, a new JavaScript runtime. Ryan also introduces JSR, an alternative to npm, and emphasizes the importance of security in the JavaScript ecosystem. Plus: thoughts on the future of JavaScript, including the role of TypeScript and bridging the gap between server‑side and browser JavaScript.
Introducing Natural Input for WebXR in Apple Vision Pro
https://webkit.org/blog/15162/introducing-natural-input-for-webxr-in-apple-vision-pro/
WebXR now includes a more natural and privacy‑preserving method for interaction—the new transient‑pointer input mode—available for Safari 17.4 in visionOS 1.1. Let’s explore how natural input for WebXR works, and how to leverage it when developing a WebXR experience for Apple Vision Pro.
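As a rough illustration of how an app might recognize this input mode: WebXR input sources expose a `targetRayMode` string, and natural input reports the value `transient-pointer`. The sketch below is not taken from the WebKit post. `InputSourceLike` and `transientInputs` are hypothetical names, and the interface is a pared-down stand-in for the real `XRInputSource` so that the snippet runs outside a browser.

```typescript
// Pared-down stand-in for the WebXR XRInputSource fields used below.
// (Assumption: only these two fields matter for this illustration.)
interface InputSourceLike {
  targetRayMode: string; // "gaze" | "tracked-pointer" | "screen" | "transient-pointer"
  handedness: string; // "none" | "left" | "right"
}

// Hypothetical helper: keep only the sources that use the transient-pointer mode.
function transientInputs(sources: InputSourceLike[]): InputSourceLike[] {
  return sources.filter((s) => s.targetRayMode === "transient-pointer");
}
```

In a real session, a check like this would typically run inside a `selectstart` or `select` event handler on the `XRSession`, using the event's `inputSource`.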
AVIF Is the Future of Web Images
https://medium.com/@fbrkovic/avif-is-the-future-of-web-images-a0b94d7f483e
AVIF is an image format that leverages the AV1 video codec for image compression. It’s designed to significantly reduce file sizes while maintaining or even improving image quality compared to older formats like JPEG, PNG, and even WebP. AVIF supports a wide range of features, including high dynamic range (HDR), wide color gamut (WCG), and 8K resolution, making it incredibly versatile for all types of web content.
In‑Depth Reading
Type System of the React Compiler
https://www.recompiled.dev/blog/type-system/
If you’re wondering what the React compiler is, I recommend reading our recent update post for background. This post is for anyone curious about the compiler theory behind it. Don’t feel pressured to understand everything in this post in order to use the compiler.
WebSockets vs Server‑Sent Events vs Long‑Polling vs WebRTC vs WebTransport
https://rxdb.info/articles/websockets-sse-polling-webrtc-webtransport.html
This article delves into these technologies, comparing performance, highlighting benefits and limitations, and offering recommendations for various use cases to help developers make informed decisions when building real‑time web applications.
require(esm) in Node.js
https://joyeecheung.github.io/blog/2024/03/18/require-esm-in-node-js/
Recently I landed experimental support for require()-ing synchronous ES modules in Node.js, a feature that has been long overdue. In the pull request I commented on why it didn’t happen sooner. This post expands on that comment.
One Billion Row Challenge in Go – From 95 s to 1.96 s
https://r2p.dev/b/2024-03-18-1brc-go/
The One Billion Row Challenge (1BRC) is simple: develop a program that can read a file with one billion lines, aggregate the information in each line, and print a report with the result.
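To make the task concrete, here is a minimal, unoptimized sketch of the aggregation step. It assumes the `station;temperature` line format and the min/mean/max report from the original challenge. The names `Stats`, `aggregate`, and `report` are hypothetical, and none of this reflects the optimizations described in the linked article.

```typescript
// Running statistics for one weather station.
interface Stats {
  min: number;
  max: number;
  sum: number;
  count: number;
}

// Aggregate "station;temperature" lines into per-station statistics.
// Hypothetical helper for illustration; blank or malformed lines are skipped.
function aggregate(lines: string[]): Record<string, Stats> {
  // A prototype-less object avoids clashes with inherited keys.
  const byStation: Record<string, Stats> = Object.create(null);
  for (const line of lines) {
    const sep = line.indexOf(";");
    if (sep < 0) continue;
    const station = line.slice(0, sep);
    const raw = line.slice(sep + 1).trim();
    const temp = Number(raw);
    if (raw === "" || isNaN(temp)) continue;
    const s = byStation[station];
    if (s === undefined) {
      byStation[station] = { min: temp, max: temp, sum: temp, count: 1 };
    } else {
      if (temp < s.min) s.min = temp;
      if (temp > s.max) s.max = temp;
      s.sum += temp;
      s.count += 1;
    }
  }
  return byStation;
}

// Format the report as "station=min/mean/max", sorted by station name.
function report(stats: Record<string, Stats>): string[] {
  return Object.keys(stats)
    .sort()
    .map((name) => {
      const s = stats[name];
      const mean = s.sum / s.count;
      return `${name}=${s.min.toFixed(1)}/${mean.toFixed(1)}/${s.max.toFixed(1)}`;
    });
}
```

A real entry would stream the file in chunks rather than hold every line in memory. This version only shows the shape of the computation that the optimized solutions speed up.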
MVP Developers Are MVPs
https://blog.visionarycto.com/p/mvp-developers-are-mvps
Developers who excel at building prototypes are the MVPs of your team.
How to Start Google
https://paulgraham.com/google.html
All you can know when you start working on a startup is that it seems worth pursuing. You can’t know whether it will become a multi‑billion‑dollar company or fail. So when I say I’m going to tell you how to start Google, I mean I’ll tell you how to get to the point where you can start a company that has as much chance of being “Google” as Google had of being “Google.”
Fresh Finds
Happy Dom: A JavaScript implementation of a web browser without its graphical user interface
React Data Grid: Feature‑rich and customizable data‑grid React component
Microdiff: A fast, zero‑dependency object and array comparison library
Atrament: A small JS library for beautiful drawing and handwriting on the HTML Canvas
[Java 22 / JDK 22: G …] (content truncated)
Originally written by Ping Xia (平侠) and published in Chinese on Web技术周刊 (Web Tech Weekly). Translated and adapted for DriftSeas with permission.