Sunday, December 13, 2020

Lavapipe

The lavapipe layer is a gallium frontend. It takes the Vulkan API and roughly translates it into the gallium API.

Vulkan is a low-level API: it allows the user to allocate memory, create resources, and record command buffers, among other things. When a hardware Vulkan driver records a command buffer, it puts hardware-specific commands into it that will be run directly on the GPU. These command buffers are submitted to queues when the app wants to execute them.

Gallium is a context-level API, like OpenGL or D3D10. The user has to create resources and contexts, and the driver internally manages command buffers and so on. The driver controls internal flushing and queuing of command buffers.

In order to bridge the gap, the lavapipe layer abstracts the gallium context into a separate thread of execution. When recording a vulkan command buffer it creates a CPU side command buffer containing an encoding of the Vulkan API. It passes that recorded CPU command buffer to the thread on queue submission. The thread then creates a gallium context, and replays the whole CPU recorded command buffer into the context, one command at a time.
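The record-and-replay flow described above can be sketched roughly as follows. This is only a minimal Python illustration of the idea, not lavapipe's actual code (which is C inside Mesa); `CommandBuffer`, `FakeGalliumContext` and `submission_thread` are hypothetical names invented for the example.

```python
import queue
import threading


class FakeGalliumContext:
    """Hypothetical stand-in for a gallium context; it only logs the calls it gets."""

    def __init__(self):
        self.log = []

    def execute(self, name, args):
        self.log.append((name, args))


class CommandBuffer:
    """CPU-side command buffer: stores an encoding of each API call, runs nothing."""

    def __init__(self):
        self.commands = []

    def record(self, name, *args):
        self.commands.append((name, args))


def submission_thread(submit_queue, results):
    """Worker that creates the context and replays submitted buffers in order."""
    ctx = FakeGalliumContext()
    while True:
        cmd_buf = submit_queue.get()
        if cmd_buf is None:  # sentinel: shut the worker down
            break
        # Replay the recorded buffer into the context, one command at a time.
        for name, args in cmd_buf.commands:
            ctx.execute(name, args)
    results.append(ctx.log)


submit_queue = queue.Queue()
results = []
worker = threading.Thread(target=submission_thread, args=(submit_queue, results))
worker.start()

# Recording only appends to a list on the calling thread.
cb = CommandBuffer()
cb.record("bind_pipeline", "graphics")
cb.record("draw", 3, 1)

# "Queue submission" hands the recorded buffer over to the worker thread.
submit_queue.put(cb)
submit_queue.put(None)
worker.join()

replayed = results[0]
```

Recording stays cheap because nothing touches the context until submission; all context work happens on the single worker thread.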

Software rasterizers are a very different proposition from real hardware in terms of overhead. CPU rasterization is heavy on the CPU, so nearly always 90% of your CPU time will be spent in the rasterizer and fragment shader. Some minor CPU overhead around command submission and queuing isn't going to matter in the overall profile of the user application. CPU rasterization is already slow, and the Vulkan->gallium translation overhead isn't going to be what makes it much slower.

For real hardware drivers, which are meant to record their own command buffers in the GPU domain and submit them directly to the hardware, adding a CPU layer that just copies the command buffer data is a massive overhead, and one that can't easily be removed from the lavapipe layer.

The lavapipe execution context is also pretty horrible: it has to connect all the state pieces, such as shaders, to the gallium context and disconnect them all at the end of each command buffer. There is only one command submission queue and one context to be used. A lot of hardware exposes more queues and similar features that this will never model.

Pipeline barriers in Vulkan are essential to using the hardware efficiently. They are one of the most difficult parts of writing a Vulkan driver to understand and get right. For a software rasterizer they are also mostly unneeded. When I get a barrier I just completely hard-flush the gallium context, because I know the software driver behind it. For a real hardware driver this would be a horrible solution; there you spend a lot of time trying to make barriers optimal.

Source: vallium-software-swrast-vulkan-layer-faq

Saturday, May 12, 2012

Linux 3.4-rc7

This is almost certainly the last -rc in this series - things really
have calmed down, and I even considered just cutting 3.4 this weekend,
but felt that another week wouldn't hurt.

The appended shortlog gives a good overview - it's mostly random tiny
fixes for very small specific issues. The biggest commit (and the one
that might affect the most people) is likely the Nouveau i2c change,
and that one is really just a revert. It changes nouveau back to use
the generic i2c-algo-bit routines - the problem they had had been
fixed in the meantime, and the specialized i2c nouveau routines had
issues of their own.

The rest is mainly small changes in various areas: drivers
(networking, drm, scsi, sound and md) arch updates (arm, powerpc and
x86) and random other areas - core networking, a compat fix, stuff
like that. No scary changes.

So go forth and test. And don't send me any pull requests unless they
contain *only* regressions or fixes for really nasty bugs. No more of
these silly compiler warning fixes etc any more.

Linus

Saturday, April 7, 2012

Linux 3.4-rc2

Another week, another -rc. It actually *felt* pretty calm, but
according to the numbers it's a fairly average -rc2, maybe it even has
slightly more changes than usual.

That said, there doesn't seem to be a lot of scary stuff. A fair
amount of the changes are some (hopefully largely final) fixups for
the header file changes, and then there are the three pull requests
mentioned in the -rc1 announcement: HSI (high-speed serial interface)
framework, dma-buf prime, and the DMA mapping stuff. Those three all
got several people piping up and saying "yes, please pull". Pohmelfs
didn't get merged, for the simple reason that nobody actually asked
for it.

Apart from the header file fixups and the three delayed pulls, there's
the usual fixes. I'm going to be stricter about pulls from here on
out; there was a lot of "noise", not just pure fixes. Some of it was
induced by me: a series of selinux patches by Eric Paris to make
selinux wrapper stack usage much better.

The bulk of the changes are in some architecture files (arm, tile,
powerpc, x86) and in drivers (especially networking, but also
regulator, drm and mmc). There are also some power management updates.

Shortlog is appended. And I'm hoping -rc3 will already have a
noticeably shorter shortlog.

Linus

Monday, June 20, 2011

What is Linux?

The Linux kernel is an operating system kernel used by the Linux family of Unix-like operating systems. It is one of the most prominent examples of free and open source software.

The Linux kernel is released under the GNU General Public License version 2 (GPLv2), plus some firmware images with various non-free licenses, and is developed by contributors worldwide. Day-to-day development discussions take place on the Linux kernel mailing list.

The Linux kernel was initially conceived and created by Finnish computer science student Linus Torvalds in 1991. Linux rapidly accumulated developers and users who adapted code from other free software projects for use with the new operating system. The Linux kernel has received contributions from thousands of programmers. Many Linux distributions have been released based upon the Linux kernel.

Saturday, September 18, 2010

Matthew Garrett: USB runtime power management

Matthew Garrett has just committed some patches to the rawhide (not F14) tree that re-enable USB autosuspend on some devices. This set includes a workaround in the Bluetooth input code that should handle the case where people were seeing their input devices become laggy when autosuspend was enabled, but there's still some chance that other Bluetooth devices will behave slightly oddly. If that's the case then try:

echo on >/sys/class/bluetooth/hci0/device/power/control

and see if it improves things. If so then please file a bug and include information about the device you're trying to connect to.
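For anyone who wants to script the same toggle, a minimal Python sketch follows. The helper names are hypothetical; the sysfs path is the one from the command above and assumes `hci0` is the affected adapter. Writing to it normally requires root. The `power/control` file accepts `on` (keep the device awake) and `auto` (allow runtime suspend).

```python
from pathlib import Path

# Path taken from the command above; hci0 is assumed to be the affected adapter.
DEFAULT_CONTROL = Path("/sys/class/bluetooth/hci0/device/power/control")

# Values accepted by the runtime power management control file.
VALID_MODES = ("on", "auto")


def get_power_control(control=DEFAULT_CONTROL):
    """Return the current setting, e.g. 'on' or 'auto'."""
    return Path(control).read_text().strip()


def set_power_control(mode, control=DEFAULT_CONTROL):
    """Write 'on' to disable autosuspend, or 'auto' to re-enable it."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    Path(control).write_text(mode + "\n")
```

Calling `set_power_control("on")` as root is equivalent to the `echo` command above, and `set_power_control("auto")` restores autosuspend afterwards.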

Complete story.

Tuesday, August 31, 2010

Linux Programming Interface : Michael Kerrisk

Today, Michael Kerrisk posted about his new book on his blog.

You can view and download the resources from the book's website.

Friday, June 11, 2010

[GIT PULL] Btrfs updates for 2.6.35

-----Original Message-----
From: Chris Mason
Date: Fri, 11 Jun 2010 15:37:31
To: Linus Torvalds; linux-kernel; linux-btrfs
Subject: [GIT PULL] Btrfs updates for 2.6.35

Hello everyone,

The master branch of the btrfs-unstable tree is a collection of fixes
and cleanups, including two btrfs regressions from rc1:

git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable.git master

One is a fix for freeing blocks on an FS converted from ext3/4 to btrfs,
and the other is a fallocate fix.

The rest are the usual small bug fixes.

Dan Carpenter (11) commits (+24/-17):
Btrfs: handle error returns from btrfs_lookup_dir_item() (+2/-0)
Btrfs: btrfs_read_fs_root_no_name() returns ERR_PTRs (+4/-0)
Btrfs: unwind after btrfs_start_transaction() errors (+1/-1)
Btrfs: remove unneeded null check in btrfs_rename() (+1/-3)
Btrfs: The file argument for fsync() is never null (+1/-1)
Btrfs: handle ERR_PTR from posix_acl_from_xattr() (+2/-0)
Btrfs: btrfs_lookup_dir_item() can return ERR_PTR (+1/-1)
Btrfs: uninitialized data is check_path_shared() (+1/-1)
Btrfs: handle kzalloc() failure in open_ctree() (+5/-2)
Btrfs: silence sparse warnings in ioctl.c (+4/-6)
Btrfs: btrfs_iget() returns ERR_PTR (+2/-2)

Zheng Yan (2) commits (+6/-4):
Btrfs: Fix BUG_ON for fs converted from extN (+2/-1)
Btrfs: Fix null dereference in relocation.c (+4/-3)

Liu Bo (2) commits (+14/-4):
Btrfs: Add error check for add_to_page_cache_lru (+13/-3)
Btrfs: fix break in btrfs_insert_some_items() (+1/-1)

Julia Lawall (2) commits (+9/-17):
Btrfs: Use memdup_user (+6/-14)
Btrfs: Use ERR_CAST (+3/-3)

Shi Weihua (2) commits (+6/-0):
Btrfs: prohibit a operation of changing acl's mask when noacl mount option used (+3/-0)
Btrfs: should add a permission check for setfacl (+3/-0)

Miao Xie (2) commits (+9/-1):
Btrfs: fix loop device on top of btrfs (+1/-0)
Btrfs: fix remap_file_pages error (+8/-1)

Sage Weil (1) commits (+0/-3):
Btrfs: avoid BUG when dropping root and reference in same transaction

Andi Kleen (1) commits (+2/-94):
BTRFS: Clean up unused variables -- nonbugs

Josef Bacik (1) commits (+1/-1):
Btrfs: fix fallocate regression

Prarit Bhargava (1) commits (+1/-1):
Btrfs: Fix warning in tree_search()

Total: (25) commits (+72/-142)

fs/btrfs/acl.c          |  8 ++++++++
fs/btrfs/compression.c  | 18 +++++++++++++-----
fs/btrfs/ctree.c        | 20 +-------------------
fs/btrfs/disk-io.c      | 22 +++++++++-------------
fs/btrfs/extent-tree.c  |  5 ++---
fs/btrfs/extent_io.c    |  9 ---------
fs/btrfs/extent_map.c   |  4 ++--
fs/btrfs/file.c         | 12 ++++++++++--
fs/btrfs/inode.c        | 22 +++-------------------
fs/btrfs/ioctl.c        | 36 ++++++++++++------------------------
fs/btrfs/ordered-data.c |  4 +---
fs/btrfs/relocation.c   |  7 ++++---
fs/btrfs/root-tree.c    |  5 -----
fs/btrfs/super.c        | 14 +++++++-------
fs/btrfs/tree-defrag.c  |  2 --
fs/btrfs/tree-log.c     | 15 ---------------
fs/btrfs/volumes.c      |  4 ----
fs/btrfs/xattr.c        |  2 --
fs/btrfs/zlib.c         |  5 -----
19 files changed, 72 insertions(+), 142 deletions(-)