commit 62beb979b4e43c361db54fbf3084f876fd2c11da Author: Andrew Deason Date: Mon Jul 16 16:53:34 2018 -0500 afs: Skip bulkstat if stat cache looks full Currently, afs_lookup() will try to prefetch dir entries for normal dirs via bulkstat whenever multiple pids are reading that dir. However, if we already have a lot of vcaches, ShakeLooseVCaches may be struggling to limit the vcaches we already have. Entering afs_DoBulkStat can make this worse, since we grab afs_xvcache repeatedly, we may kick out other vcaches, and we'll possibly create 30 new vcaches that may not even be used before they're evicted. To try to avoid this, skip running afs_DoBulkStat if it looks like the stat cache is really full. Reviewed-on: https://gerrit.openafs.org/13256 Reviewed-by: Cheyenne Wills Tested-by: BuildBot Reviewed-by: Benjamin Kaduk (cherry picked from commit 9ff45e73cf3d91d12f09e108e1267e37ae842c87) Change-Id: I1b84ab3bb918252e8db5e4379730a517181bc9d8 Reviewed-on: https://gerrit.openafs.org/14400 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Cheyenne Wills Reviewed-by: Stephan Wiesand commit 28794011085d35b293d5e829adadb372b2a2b3fd Author: Andrew Deason Date: Mon Jul 16 16:44:14 2018 -0500 afs: Log warning when we detect too many vcaches Currently, afs_ShakeLooseVCaches has a kind of warning that is logged when we fail to free up any vcaches. This information can be useful to know, since it may be a sign that users are trying to access way more files than our configured vcache limit, hindering performance as we constantly try to evict and re-create vcaches for files. However, the current warning is not clear at all to non-expert users, and it can only occur for non-dynamic vcaches (which is uncommon these days). To improve this, try to make a general determination if it looks like the stat cache is "stressed", and log a message if so after afs_ShakeLooseVCaches runs (for all platforms, regardless of dynamic vcaches). Also try to make the message a little more user-friendly, and only log it (at most) once per 4 hours. Determining whether the stat cache looks stressed or not is difficult and arguably subjective (especially for dynamic vcaches). This commit draws a few arbitrary lines in the sand to make the decision, so at least something will be logged in the cases where users are constantly accessing way more files than our configured vcache limit. Reviewed-on: https://gerrit.openafs.org/13255 Reviewed-by: Cheyenne Wills Tested-by: BuildBot Reviewed-by: Benjamin Kaduk (cherry picked from commit 0532f917f29bdb44f4933f9c8a6c05c7fecc6bbb) Change-Id: Ic7260f276e00f3e34541207955df841d4ed27844 Reviewed-on: https://gerrit.openafs.org/14399 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Cheyenne Wills Reviewed-by: Stephan Wiesand commit 47ab46e1e9fe56b5bf8147eab0b652e74078cbe7 Author: Andrew Deason Date: Mon Jul 16 16:08:13 2018 -0500 afs: Split out bulkstat conditions into a function Our current if() statement for determining whether we should run afs_DoBulkStat to prefetch dir entries is a bit large, and grows over time. Split this logic out into a separate function to make it easier to maintain, and add some comments to help explain each condition. This commit should have no visible effects; it's just code reorganization. 
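To make the shape of that reorganization concrete, here is a rough sketch of what such a predicate helper can look like; the function name, field names, and thresholds below are illustrative only, not the actual OpenAFS code:

    /* Illustrative sketch: gather the "should we prefetch this dir with
     * bulkstat?" checks in one place, so each condition can be commented
     * and the caller in afs_lookup() stays small. */
    static int
    ShouldTryBulkStat(struct vcache *adp)
    {
        if (afs_IsDynroot(adp)) {
            /* dynroot dirs are generated locally; nothing to prefetch */
            return 0;
        }
        if (adp->opens <= 1) {
            /* only bother when multiple processes are reading this dir */
            return 0;
        }
        if (afsd_dynamic_vcaches && afs_vcount > 2 * afs_maxvcount) {
            /* stat cache already looks overfull (threshold illustrative);
             * creating ~30 more vcaches would only add to the pressure
             * that ShakeLooseVCaches is already struggling with */
            return 0;
        }
        return 1;
    }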
Reviewed-on: https://gerrit.openafs.org/13254 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk (cherry picked from commit 19cd454f11997d286bc415e9bc9318a31f73e2c6) Change-Id: Ida322c518d11787fd794df7534135fbc2dec2935 Reviewed-on: https://gerrit.openafs.org/14398 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Cheyenne Wills Reviewed-by: Stephan Wiesand commit 738eb3e4976947657f877ec107d517d63a66a907 Author: Andrew Deason Date: Sun Jul 8 15:00:02 2018 -0500 afs: Bound afs_DoBulkStat dir scan Currently, afs_DoBulkStat will scan the entire directory blob, looking for entries to stat. If all or almost all entries are already stat'd, we'll scan through the entire directory, doing nontrivial work on each entry (we grab afs_xvcache, at least). All of this work is pretty pointless, since the entries are already cached and so we won't do anything. If many processes are trying to acquire afs_xvcache, this can contribute to performance issues. To avoid this, provide a constant bound on the number of entries we'll search through: nentries * 4. The current arbitrary limits cap nentries at 30, so this means we're capping the afs_DoBulkStat search to 120 entries. Reviewed-on: https://gerrit.openafs.org/13253 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk (cherry picked from commit ba8b92401b8cb2f5a5306313c2702cb36cba083c) Change-Id: Icf82f88328621cb5a9e0ad52f873b8a7d74b1f3a Reviewed-on: https://gerrit.openafs.org/14397 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Cheyenne Wills Reviewed-by: Stephan Wiesand commit fbba8e4cbf00f32181b5075e8d7a876cccd2a046 Author: Andrew Deason Date: Thu Jul 13 17:40:36 2017 -0500 afs: Avoid needless W-locks for afs_FindVCache The callers of afs_FindVCache must hold at least a read lock on afs_xvcache; some hold a shared or write lock (and set IS_SLOCK or IS_WLOCK in the given flags). Two callers (afs_EvalFakeStat_int and afs_DoBulkStat) currently hold a write lock, but neither of them need to. In the optimal case, where afs_FindVCache finds the given vcache, this means that we unnecessarily hold a write lock on afs_xvcache. This can impact performance, since afs_xvcache can be a very frequently accessed lock (a simple operation like afs_PutVCache briefly holds a read lock, for example). To avoid this, have afs_DoBulkStat hold a shared lock on afs_xvcache, upgrading to a write lock when needed. afs_EvalFakeStat_int doesn't ever need a write lock at all, so just convert it to a read lock. Reviewed-on: https://gerrit.openafs.org/12656 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk (cherry picked from commit 6c808e05adb0609e02cd61e3c6c4c09eb93c1630) Change-Id: Id517d1098b4c3a02db646b2a74535f77cb684ec3 Reviewed-on: https://gerrit.openafs.org/14396 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Cheyenne Wills Reviewed-by: Stephan Wiesand commit 5c476b91fd2259e9c34070be8ba201dd471ad974 Author: Andrew Deason Date: Thu Jul 13 17:40:21 2017 -0500 afs: Change VerifyVCache2 calls to VerifyVCache afs_VerifyVCache is a macro that (on most platforms) effectively expands to: if ((avc->f.states & CStatd)) { return 0; } else { return afs_VerifyVCache2(...); } Some callers call afs_VerifyVCache2 directly, since they already check for CStatd for other reasons. A few callers currently call afs_VerifyVCache2, but without guaranteeing that CStatd is not set. Specifically, in afs_getattr and afs_linux_VerifyVCache, CStatd could be set while afs_CreateReq drops GLOCK. 
And in afs_linux_readdir, CStatd could be cleared at multiple different points before the VerifyVCache call. This can result in afs_VerifyVCache2 acquiring a write-lock on the vcache, even when CStatd is already set, which is an unnecessary performance hit. To avoid this, change these call sites to use afs_VerifyVCache instead of calling afs_VerifyVCache2 directly, which skips the write lock when CStatd is already set. Reviewed-on: https://gerrit.openafs.org/12655 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk (cherry picked from commit a05d5b7503e466e18f5157006c1de2a2f7d019f7) Change-Id: I05bdcb7f10930ed465c24a8d7e51077a027b1a4b Reviewed-on: https://gerrit.openafs.org/14395 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Cheyenne Wills Reviewed-by: Stephan Wiesand commit 150ee65f286409e37fcf94807dbeaf4b79ab0769 Author: Andrew Deason Date: Sun Apr 26 17:26:02 2020 -0500 rx: Use _IsLast to check for last call in queue Ever since commits 170dbb3c (rx: Use opr queues) and d9fc4890 (rx: Fix test for end of call queue for LWP), rx_GetCall checks if the current call is the last one on rx_incomingCallQueue by doing this: opr_queue_IsEnd(&rx_incomingCallQueue, cursor) But opr_queue_IsEnd checks if the given pointer is the _end_ of the list; that is, if it's the end-of-list sentinel, not an item on the actual list. Testing for the last item in a list is what opr_queue_IsLast is for. This is the same convention that the old Rx queues used, but 170dbb3c just accidentally replaced queue_IsLast with opr_queue_IsEnd (instead of opr_queue_IsLast), and d9fc4890 copied the mistake. Because this is inside an opr_queue_Scan loop, opr_queue_IsEnd will never be true, so we'll never enter this block of code (unless we are the "fcfs" thread). This means that an incoming Rx call can get stuck in the incoming call queue, if all of the following are true: - The incoming call consists of more than 1 packet of incoming data. - The incoming call "waits" when it comes in (that is, there are no free threads or the service is over quota). - The "fcfs" thread doesn't scan the incoming call queue (because it is idle when the call comes in, but the relevant service is over quota). To fix this, just use opr_queue_IsLast here instead of opr_queue_IsEnd. Reviewed-on: https://gerrit.openafs.org/14158 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot (cherry picked from commit befc72749884c6752c7789479343ba48c7d5cea1) Change-Id: If724245414798ae7a86dfa048cf99863317aef8e Reviewed-on: https://gerrit.openafs.org/14394 Reviewed-by: Cheyenne Wills Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Stephan Wiesand commit 12aed9ef2988a69187910bada5ba7325cb6144ab Author: Andrew Deason Date: Sun Jul 21 18:48:51 2019 -0500 rx: Avoid osi_NetSend during rx shutdown Commit 8d939c08 (rx: avoid nat ping during shutdown) added a call to shutdown_rx() inside the DARWIN shutdown sequence, before the rx socket was closed. From the commit message, it sounds like this was done to avoid NAT pings from calling osi_NetSend during the shutdown sequence after the rx socket was closed; calling shutdown_rx() before closing the socket would cause any connections we had to be destroyed first, avoiding that. The problem with this is that this means shutdown_rx() is called when osi_StopNetIfPoller is called, which is much earlier than some other portions of the shutdown sequence, some of which may hold references to e.g. rx connections. If we try to, for instance, destroy an rx connection after shutdown_rx() is called, we could panic.
An earlier version of that commit (gerrit PS1) just tried to insert a check before the relevant osi_NetSend call, making us just skip the osi_NetSend if the shutdown sequence had been started. So to avoid the above issue, try to implement that approach instead. And instead of doing it just for NAT pings, we can do it for almost all osi_NetSend calls (besides those involved in the shutdown sequence itself), by checking this in rxi_NetSend. Also return an error (ESHUTDOWN) if we skip the osi_NetSend call, so we're not completely silent about doing so. This means we also remove the call to shutdown_rx() inside DARWIN's osi_StopNetIfPoller(). This allows us to interact with Rx objects during more of the shutdown process in cross-platform code. Reviewed-on: https://gerrit.openafs.org/13718 Reviewed-by: Mark Vitale Reviewed-by: Benjamin Kaduk Tested-by: BuildBot (cherry picked from commit 9866511bb0a5323853e97e3ee92524198813776e) Change-Id: Ie62c1a68d8a8889f7a8aa3eff3973c219b45a95c Reviewed-on: https://gerrit.openafs.org/14393 Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Stephan Wiesand commit 7f59989cf84df1e2077f4fcf5649c9624e79a5d2 Author: Andrew Deason Date: Sun Jul 21 18:31:53 2019 -0500 rx: Introduce rxi_NetSend Introduce a small wrapper around osi_NetSend, called rxi_NetSend. This small wrapper allows future commits to change the code around our osi_NetSend calls, without needing to change every single call site, or every implementation of osi_NetSend. Change most call sites to use rxi_NetSend, instead of osi_NetSend. Do not change a few callers in the platform-specific kernel shutdown sequence, since those call osi_NetSend for platform-specific reasons. This commit on its own does not change any behavior with osi_NetSend; it is just code reorganization. Reviewed-on: https://gerrit.openafs.org/13717 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot (cherry picked from commit 2a33a80f7026df6b5e47e42319c55d8b7155675a) Change-Id: I6af8541953a582d044fb668eb4a91720536bc8e1 Reviewed-on: https://gerrit.openafs.org/14392 Reviewed-by: Cheyenne Wills Tested-by: BuildBot Reviewed-by: Stephan Wiesand commit 38f613f5e15552938cb425c68b22c166e35284be Author: Andrew Deason Date: Tue Oct 13 20:18:59 2020 -0500 ubik: Remove unused sampleName The RPC-L type sampleName and related constant UMAXNAMELEN are not referenced by anything, and have been unused since OpenAFS 1.0. Remove the unused definitions. Reviewed-on: https://gerrit.openafs.org/14386 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk (cherry picked from commit 83ce8d41c68a5ebcc84132d77af9024c6d285e05) Change-Id: I1d6c583d9c630fc9704578ba3329132e16b3a803 Reviewed-on: https://gerrit.openafs.org/14401 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Cheyenne Wills Reviewed-by: Stephan Wiesand commit f7374a883505aa34a3db34ddd1163367c544bb0c Author: Andrew Deason Date: Sat May 2 23:54:55 2020 -0500 afs: Drop GLOCK for RXAFS_GetCapabilities We are hitting the net here; we certainly should not be holding AFS_GLOCK while waiting for the server's response. Found via FreeBSD WITNESS. 
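The GetCapabilities fix presumably follows the usual cache-manager pattern of dropping the global lock around the wire call and re-taking it afterwards; a minimal sketch (error handling, the surrounding retry loop, and variable names are illustrative):

    Capabilities caps;
    afs_int32 code;

    memset(&caps, 0, sizeof(caps));     /* zero the XDR output array */

    AFS_GUNLOCK();                      /* do not hold AFS_GLOCK across the network */
    code = RXAFS_GetCapabilities(rxconn, &caps);
    AFS_GLOCK();                        /* re-acquire before touching cache manager state */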
Reviewed-on: https://gerrit.openafs.org/14181 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk (cherry picked from commit 44b7b93b593371bfdddd0be0ae603f4f8720f78b) Change-Id: I186e08c89136cc3109fd2519bb0d2abbb52f9ba0 Reviewed-on: https://gerrit.openafs.org/14391 Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Stephan Wiesand commit 02fb067d3fded0eee5c9326c56e66b512a75d71c Author: Michael Meffie Date: Mon Mar 23 09:46:05 2020 -0400 build: remove unused LINUX_PKGREL from configure.ac This change removes the unused LINUX_PKGREL definition from the configure.ac file. Commit 6a27e228bac196abada96f34ca9cd57f32e31f5c converted the setting of the RPM package version and release values in the openafs.spec file from autoconf to the makesrpm.pl script. That commit left LINUX_PKGREL in configure.ac because it was still referenced by the Debian packaging, which was still in-tree at that time. Commit ada9dba0756450993a8e57c05ddbcae7d1891582 removed the last trace of the Debian packaging, but missed the removal of the LINUX_PKGREL. Reviewed-on: https://gerrit.openafs.org/14117 Reviewed-by: Cheyenne Wills Reviewed-by: Benjamin Kaduk Tested-by: Benjamin Kaduk (cherry picked from commit 8ae4531c5720baff9e11e4b05706eab6c82de5f9) Conflicts: configure.ac Change-Id: I69925f89c52aef32aea5bc308140936517b1aeb0 Reviewed-on: https://gerrit.openafs.org/14363 Reviewed-by: Cheyenne Wills Reviewed-by: Michael Meffie Tested-by: BuildBot Reviewed-by: Stephan Wiesand commit 2f2cbff766650e7c25eb332d5385f4a3fca8676d Author: Andrew Deason Date: Mon Apr 20 13:03:15 2020 -0500 ubik: Avoid unlinking garbage during recovery In urecovery_Interact, if any of our operations fail around calling DISK_GetFile, we will jump to FetchEndCall and eventually unlink 'pbuffer'. But if we failed before opening our .DB0.TMP file, the contents of 'pbuffer' will not be initialized yet. During most iterations of the recovery loop, the contents of 'pbuffer' will be filled in from previous loops, and it should always stay the same, so it's not a big problem. But if this is the first iteration of the loop, the contents of 'pbuffer' may be stack garbage. Solve this in two ways. To make sure we don't use garbage contents in 'pbuffer', memset the whole thing to zeroes at the beginning of urecovery_Interact(). And then to make sure we're not reusing 'pbuffer' contents from previous iterations of the loop, also clear the first character to NUL each time we arrive at this area of the recovery code. And avoid unlinking anything if pbuffer starts with a NUL. Commit 44e80643 (ubik: Avoid unlinking garbage) fixes the same issue, but only fixed it in the SDISK_SendFile codepath in remote.c. Reviewed-on: https://gerrit.openafs.org/14153 Tested-by: BuildBot Reviewed-by: Marcio Brito Barbosa Reviewed-by: Benjamin Kaduk (cherry picked from commit 98b5ffb52117aefac5afb47b30ce9b87eb2fdebf) Change-Id: I5cadb88e466ddd326ef1e4138d5b1bf89fdb27dc Reviewed-on: https://gerrit.openafs.org/14365 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Michael Meffie Reviewed-by: Stephan Wiesand commit 8993e35578e4ae51dd5e8941f77c18bb975e51af Author: Andrew Deason Date: Fri Sep 18 14:03:37 2020 -0600 WINNT: Make opr_threadname_set a no-op We don't supply an implementation for opr_threadname_set for WINNT; don't pretend that we do. 
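Concretely, the 'pbuffer' handling described in the urecovery_Interact() change above amounts to three small steps; this is a rough sketch with the surrounding recovery loop abbreviated and the buffer size illustrative:

    char pbuffer[1024];   /* path of the .DB0.TMP file; size illustrative */
    int code = 0;

    /* 1) At the top of urecovery_Interact(): make sure pbuffer never holds
     *    stack garbage, even on the very first pass of the recovery loop. */
    memset(pbuffer, 0, sizeof(pbuffer));

    /* 2) Each time we reach the DISK_GetFile() section of the loop: forget
     *    whatever name a previous iteration left behind. */
    pbuffer[0] = '\0';

    /* 3) On the FetchEndCall error path: only unlink a file we actually named. */
    if (code != 0 && pbuffer[0] != '\0')
        unlink(pbuffer);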
Reviewed-on: https://gerrit.openafs.org/13817 Tested-by: Andrew Deason Reviewed-by: Benjamin Kaduk (cherry picked from commit f895a9b51671ffdc920fd9b4284337c5b737a0ef) Change-Id: Ie8df82550f5420e2b024dea29aae0e39e3ee506f Reviewed-on: https://gerrit.openafs.org/14369 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Michael Meffie Reviewed-by: Stephan Wiesand commit 045a97dfbc8dd6a2794b74e16f47984dc5f8eccf Author: Andrew Deason Date: Fri Sep 18 12:11:36 2020 -0600 Move afs_pthread_setname_self to opr Move the functionality in afs_pthread_setname_self from libutil to opr, in a new function opr_threadname_set. This allows us to more easily use the routine in more subsystems, since most code already uses opr. Reviewed-on: https://gerrit.openafs.org/13655 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk (cherry picked from commit 9d28f7390332c92b3d9e863c6fe70c26db28b5ad) Change-Id: Ic6bbb615bc3494a7a114a0f4cae711b65ebec111 Reviewed-on: https://gerrit.openafs.org/14368 Reviewed-by: Andrew Deason Tested-by: Andrew Deason Reviewed-by: Michael Meffie Reviewed-by: Stephan Wiesand commit 41f4bb48741065da6a69ffcb05e451e5c7dac757 Author: Marcio Barbosa Date: Mon Aug 31 19:56:56 2020 +0000 Revert "vos: take RO volume offline during convertROtoRW" This reverts commit 32d35db64061e4102281c235cf693341f9de9271. While that commit did fix the mentioned problem, depending on "vos" to set the volume to be converted as "out of service" is not ideal. Instead, this volume should be set as offline by the SAFSVolConvertROtoRWvolume RPC, executed on the volume server. The proper fix for this problem will be introduced by another commit. Reviewed-on: https://gerrit.openafs.org/14339 Reviewed-by: Cheyenne Wills Reviewed-by: Michael Meffie Reviewed-by: Andrew Deason Tested-by: BuildBot Reviewed-by: Benjamin Kaduk (cherry picked from commit 85893ac3df0c2cb48776cf1203ec200507b6ce7d) Change-Id: Ie125d2dae1301ca5a4f8323099e6e42bc57b6d28 Reviewed-on: https://gerrit.openafs.org/14361 Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Michael Meffie Reviewed-by: Stephan Wiesand commit 29cdc0af10cb2a0f4cb4d1b05e72d6f066a1a7a5 Author: Yadavendra Yadav Date: Wed Apr 15 05:33:00 2020 -0500 LINUX: Always crref after _settok_setParentPag Commit b61eac78 (Linux: setpag() may replace credentials) changed PSetTokens2 to call crref() after _settok_setParentPag(), since changing the parent PAG may change our credentials structure. But that commit did not update the old pioctl PSetTokens, so -setpag functionality remained broken on Linux for utilities that called the old pioctl ('klog' is one such utility). To fix this, we could copy the same code from PSetTokens2 into PSetTokens. But instead just move this code into _settok_setParentPag itself, to avoid code duplication. This commit also refactors _settok_setParentPag a little to make the platform-specific ifdefs a little easier to read through. 
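A rough sketch of the credential-refresh pattern this describes; the ifdef structure and the surrounding PAG-switching details are illustrative, and only the crref()/crfree() refresh after the PAG change is the point:

    static int
    _settok_setParentPag(afs_ucred_t **acred)
    {
        int code = 0;
    #if defined(AFS_LINUX26_ENV)   /* guard illustrative */
        /* ...switch the parent process into the new PAG here... */

        /* Changing the parent PAG may have installed a brand new cred
         * structure for the current task, so the reference in *acred can
         * be stale.  Drop it and take a reference on the current creds. */
        crfree(*acred);
        *acred = crref();
    #endif
        return code;
    }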
Reviewed-on: https://gerrit.openafs.org/14147 Reviewed-by: Andrew Deason Tested-by: BuildBot Reviewed-by: Yadavendra Yadav Reviewed-by: Benjamin Kaduk (cherry picked from commit 8002a46125e8224ba697c194edba5ad09e4cfc44) Change-Id: I6a9d10428fe470cb38e3ca669f273dde0fa9c875 Reviewed-on: https://gerrit.openafs.org/14328 Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Michael Meffie Reviewed-by: Stephan Wiesand commit 9553128008c7656d78759aa685a574906d352f3c Author: Yadavendra Yadav Date: Wed Apr 15 05:33:00 2020 -0500 LINUX: Copy session keys to parent in SetToken Commit 48589b5d (Linux: Restore aklog -setpag functionality for kernel 2.6.32+) added code to SetToken() to copy our session keyring to the parent process, in order to implement -setpag functionality. But this was removed from SetToken() in commit 1a6d4c16 (Linux: fix aklog -setpag to work with ktc_SetTokenEx), when the same code was moved to ktc_SetTokenEx(). Add this code back to SetToken(), so -setpag functionality can work again with utilities that use older functions like ktc_SetToken, like 'klog'. Reviewed-on: https://gerrit.openafs.org/14146 Reviewed-by: Andrew Deason Tested-by: BuildBot Reviewed-by: Benjamin Kaduk (cherry picked from commit 826bb826274e48c867b41cb948d031a423373901) Change-Id: I1ae90d92efa27bce2ff59ff9b9dcca370eaf4730 Reviewed-on: https://gerrit.openafs.org/14327 Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Michael Meffie Reviewed-by: Stephan Wiesand commit 01dfdf2608c5643714da61950637c1e7d1994023 Author: Yadavendra Yadav Date: Fri Aug 21 01:54:00 2020 +0530 afs: Avoid NatPing event on all connection Inside release_conns_user_server, the connection vector is traversed and, after a connection is destroyed, a new eligible connection is found on which the NatPing event will be set. Ideally there should be only one connection with NatPing set, but in the current code, while traversing all of a server's connections, a NatPing event is set on every connection to that server. In cases where we have a large number of connections to a server, this can lead to a huge number of "RX_PACKET_TYPE_VERSION" packets being sent to that server. Since this happens during garbage collection of user structs, the following steps were used to simulate the issue: - Had one script which "cd"s to a volume mount point and then sleeps for a long time. - Ran an infinite while loop in which the above script was called using PAG-based tokens (as a new connection will be created for each PAG). - Instrumented the code so that we hit the code segment where the NatPing event is set; mainly, reduced NOTOKTIMEOUT to 60 sec. To fix this issue, set NatPing on one connection only, and once it is set, break out of the "for" loop traversing the server's connections. Reviewed-on: https://gerrit.openafs.org/14312 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk Reviewed-by: Andrew Deason (cherry picked from commit b968875a342ba8f11378e76560b46701f21391e8) Change-Id: I8c70108ab3eb73ed1d9e598381bc29b87ca42aa0 Reviewed-on: https://gerrit.openafs.org/14364 Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Michael Meffie Reviewed-by: Stephan Wiesand commit e4103743d91bcdba1c57c850eb07b0b03abe72d6 Author: Andrew Deason Date: Mon Jun 22 22:54:52 2020 -0500 volser: Don't NUL-pad failed pread()s in dumps Currently, the volserver SAFSVolDump RPC and the 'voldump' utility handle short reads from pread() for vnode payloads by padding the missing data with NUL bytes.
That is, if we request 4k of data for our pread() call, and we only get back 1k of data, we'll write 1k of data to the volume dump stream followed by 3k of NUL bytes, and log messages like this: 1 Volser: DumpFile: Error reading inode 1234 for vnode 5678 1 Volser: DumpFile: Null padding file: 3072 bytes at offset 40960 This can happen if we hit EOF on the underlying file sooner than expected, or if the OS just responds with fewer bytes than requested for any reason. The same code path tries to do the same NUL-padding if pread() returns an error (for example, EIO), padding the entire e.g. 4k block with NULs. However, in this case, the "padding" code often doesn't work as intended, because we compare 'n' (set to -1) with 'howMany' (set to 4k in this example), like so: if (n < howMany) Here, 'n' is signed (ssize_t), and 'howMany' is unsigned (size_t), and so compilers will promote 'n' to the unsigned type, causing this conditional to fail when n is -1. As a result, all of the relevant log messages are skipped, and the data in the dumpstream gets corrupted (we skip a block of data, and our 'howFar' offset goes back by 1). So this can result in rare silent data corruption in volume dumps, which can occur during volume releases, moves, etc. To fix all of this, remove this bizarre NUL-padding behavior in the volserver. Instead: - For actual errors from pread(), return an error, like we do for I/O errors in most other code paths. - For short reads, just write out the amount of data we actually read, and keep going. - For premature EOF, treat it like a pread() error, but log a slightly different message. For the 'voldump' utility, the padding behavior can make sense if a user is trying to recover volume data offline in a disaster recovery scenario. So for voldump, add a new switch (-pad-errors) to enable the padding behavior, but change the default behavior to bail out on errors. Reviewed-on: https://gerrit.openafs.org/14255 Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Benjamin Kaduk (cherry picked from commit 4498bd8179e5e93a33468be3c8e7a30e569d560a) Change-Id: Idf89d70c9d4d650dbf7b73e67c5b71b9bab7c3f4 Reviewed-on: https://gerrit.openafs.org/14367 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Cheyenne Wills Reviewed-by: Michael Meffie Reviewed-by: Stephan Wiesand commit 9947625a1d67b4ffdc0582e9081000e34be2b46b Author: Michael Meffie Date: Fri May 15 12:01:44 2020 -0400 vos: avoid CreateVolume when restoring over an existing volume Currently, the UV_RestoreVolume2 function always attempts to create a new volume, even when doing an incremental restore over an existing volume. When the volume already exists, the volume creation operation fails on the volume server with a VVOLEXISTS error. The client will then attempt to obtain a transaction on the existing volume. If a transaction is obtained, the incremental restore operation will proceed. If a full restore is being done, the existing volume is removed and a new empty volume is created. Unfortunately, the failed volume creation is logged by the volume server, and so litters the log file with: Volser: CreateVolume: Unable to create the volume; aborted, error code 104 To avoid polluting the volume server log with these messages, reverse the logic in UV_RestoreVolume2. Assume the volume already exists and try to get the transaction first when doing an incremental restore. Create a new volume if the transaction cannot be obtained because the volume is not present.
When doing a full restore, remove the existing volume, if one exists, and then create a new empty volume. Reviewed-on: https://gerrit.openafs.org/14208 Tested-by: BuildBot Reviewed-by: Marcio Brito Barbosa Tested-by: Marcio Brito Barbosa Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk (cherry picked from commit f5051b87a56b3a4f7fd7188cbd16a663eee8abbf) Change-Id: I422b81e0c800ff655ac8851b2872f4d7160d79a8 Reviewed-on: https://gerrit.openafs.org/14326 Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Michael Meffie Reviewed-by: Stephan Wiesand commit 967aa95fe23f98c911b8a618132442159eda3f77 Author: Yadavendra Yadav Date: Wed Apr 29 05:10:05 2020 +0000 rxkad: Use krb5_enctype_keysize in tkt_DecodeTicket5 Inside tkt_DecodeTicket5 (rxkad/ticket5.c) function, keysize is calculated using krb5_enctype_keybits and then dividing number of bits by 8. For 3DES number of keybits are 168, so keysize comes out to 21(168/8). However actual keysize of 3DES key is 24. This keysize is passed to _afsconf_GetRxkadKrb5Key where keysize comparison happens, since there is keysize mismatch it returns AFSCONF_BADKEY. To fix this issue get keysize from krb5_enctype_keysize function instead of krb5_enctype_keybits. Thanks to John Janosik (jpjanosi@us.ibm.com) for analyzing and fixing this issue. Reviewed-on: https://gerrit.openafs.org/14203 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk (cherry picked from commit 5d53ed0bdab6fea6d2426691bdef2b6f9cb7f2fe) Change-Id: I16cd7a803a139802671a8045dac09e10ef4ad6cb Reviewed-on: https://gerrit.openafs.org/14325 Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Michael Meffie Reviewed-by: Stephan Wiesand commit 28e96b954fe17810d1ebdd9fdd4702511e870a67 Author: Jan Iven Date: Tue Sep 1 14:51:25 2020 +0200 ptserver: Remove duplicate ubik_SetLock in listSuperGroups It looks like a call to ubik_SetLock(.. LOCKREAD) was left in place in listSuperGroups after locking was moved to ReadPreamble in commit a6d64d70 (ptserver: Refactor per-call ubik initialisation) When compiled with 'supergroups', and once contacted by "pts mem -expandgroups ..", ptserver will therefore abort() with Ubik: Internal Error: attempted to take lock twice This patch removes the superfluous ubik_SetLock. FIXES 135147 Reviewed-on: https://gerrit.openafs.org/14338 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk (cherry picked from commit 696f2ec67b049639abf04905255a7d6173dbb19e) Change-Id: I62bfe44e374b398278658b61f2ecf9a66fab18ae Reviewed-on: https://gerrit.openafs.org/14345 Reviewed-by: Michael Meffie Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Stephan Wiesand commit c7a3154382376ca2e5effdfed2c447c86b9f6eed Author: Michael Meffie Date: Wed Jul 1 21:50:09 2020 -0400 redhat: Add make to the dkms-openafs pre-requirements If `make` is not installed before dkms-openafs, the OpenAFS kernel module is not built during the dkms-openafs package installation. The failure happens in the "checking if linux kernel module build works" configure step, which invokes `make` to check the linux buildsystem. configure fails when `make` is not available, and gives the unhelpful suggestion (in this case) of configuring with --disable-kernel module. Running the configure.log in the dkms build directory shows: configure:7739: checking if linux kernel module build works make -C /lib/modules/4.18.0-193.6.3.el8_2.x86_64/build M=/var/lib/dkms/openafs/... 
./configure: line 7771: make: command not found configure: failed using Makefile: Avoid this build failure by adding `make` to the list of dkms-openafs package pre-requirements. Reviewed-on: https://gerrit.openafs.org/14266 Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Benjamin Kaduk (cherry picked from commit e61ab9353e99d3298815296abf6b02c50ebe3df0) Change-Id: I9b2bb73acabfabc1cf8b5514c8aa66572cc96066 Reviewed-on: https://gerrit.openafs.org/14314 Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Cheyenne Wills Reviewed-by: Stephan Wiesand commit ddaf6a74ff3e0aeb4c8b2b15886eaa5c90db59bf Author: Cheyenne Wills Date: Fri Aug 28 10:32:01 2020 -0600 INSTALL: document the minimum Linux kernel level The change associated with gerrit #14300 removed support for older Linux kernels (2.6.10 and earlier). The commit 'Import of code from autoconf-archive' (d8205bbb4) introduced a check for Autoconf 2.64. Autoconf 2.64 was released in 2009. The commit 'regen.sh: Use libtoolize -i, and .gitignore generated build-tools' (a7cc505d3) introduced a dependency on libtool's '-i' option. Libtool supported the '-i' option with libtool 1.9b in 2004. Update the INSTALL instructions to document a minimum Linux kernel level and the minimum levels for autoconf and libtool. Notes: RHEL4 (EOL in 2017) had a 2.6.9 kernel and RHEL5 has a 2.6.18 kernel. RHEL5 has libtool 1.5.22 and autoconf 2.59, RHEL6 has libtool 2.2.6 and autoconf 2.63, and RHEL7 has libtool 2.4.2 and autoconf 2.69. Reviewed-on: https://gerrit.openafs.org/14305 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk (cherry picked from commit 16bae98ec525fa013514fb46398df682d7637ae0) Change-Id: I7aaf434928204df77851dd2d651d43b86f5b53d2 Reviewed-on: https://gerrit.openafs.org/14331 Reviewed-by: Michael Meffie Tested-by: BuildBot Reviewed-by: Stephan Wiesand commit fee39617c350623dc659e7a21287a1791bdff5d2 Author: Andrew Deason Date: Wed Apr 1 22:59:38 2020 -0500 vos: Print "done" in non-verbose 'vos remsite' Currently, 'vos remsite' always prints the message "Deleting the replication site for volume %lu ...", and then calls VDONE if the operation is successful. VDONE prints the trailing "done", but only if -verbose is turned on, and so if -verbose is not specified, the output of 'vos remsite' looks broken: $ vos remsite fs1 vicepa vol.foo Deleting the replication site for volume 1234 ...Removed replication site fs1 /vicepa for volume vol.foo To fix this, unconditionally print the trailing "done", instead of going through VDONE, so 'vos remsite' output now looks like this: $ vos remsite fs1 vicepa vol.foo Deleting the replication site for volume 1234 ... done Removed replication site fs1 /vicepa for volume vol.foo Reviewed-on: https://gerrit.openafs.org/14127 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk (cherry picked from commit f16d40ad26df3ec871f8c73952594ad2e723c9b4) Change-Id: I4cd7cb2156391004e57cd42437d7174a6bd70992 Reviewed-on: https://gerrit.openafs.org/14311 Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Stephan Wiesand commit ff5b64d074376992d0be02e19b4da789c20d6bab Author: Marcio Barbosa Date: Thu Feb 13 00:39:00 2020 -0300 vos: take RO volume offline during convertROtoRW The vos convertROtoRW command converts a RO volume into a RW volume. Unfortunately, the RO volume in question is not set as "out of service" during this process. As a result, accesses to the volume being converted can leave volume objects in an inconsistent state. Consider the following scenario: 1. 
Create a volume on host_b and add replicas on host_a and host_b. $ vos create host_b a vol_1 $ vos addsite host_b a vol_1 $ vos addsite host_a a vol_1 2. Mount the volume: $ fs mkmount /afs/.mycell/vol_1 vol_1 $ vos release vol_1 $ vos release root.cell 3. Shutdown dafs on host_b: $ bos shutdown host_b dafs 4. Remove RO reference to host_b from the vldb: $ vos remsite host_b a vol_1 5. Attach the RO copy by touching it: $ fs flushall $ ls /afs/mycell/vol_1 6. Convert RO copy to RW: $ vos convertROtoRW host_a a vol_1 Notice that FSYNC_com_VolDone fails silently (FSYNC_BAD_STATE), leaving the volume object for the RO copy set as VOL_STATE_ATTACHED (on success, this volume should be set as VOL_STATE_DELETED). 7. Add replica on host_a: $ vos addsite host_a a vol_1 8. Wait until the "inUse" flag of the RO entry is cleared (or force this to happen by attaching multiple volumes). 9. Release the volume: $ vos release vol_1 Failed to start transaction on volume 536870922 Volume not attached, does not exist, or not on line Error in vos release command. Volume not attached, does not exist, or not on line To fix this problem, take the RO volume offline during the vos convertROtoRW operation. Reviewed-on: https://gerrit.openafs.org/14066 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk (cherry picked from commit 32d35db64061e4102281c235cf693341f9de9271) Change-Id: Ie4cfab2f04a8859ddfcaece371198ac544066770 Reviewed-on: https://gerrit.openafs.org/14310 Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Stephan Wiesand commit 696d9ce83e3556bda2d3945326937c63dd3560ed Author: Michael Meffie Date: Tue Apr 19 20:46:33 2016 -0400 ubik: positional io for db reads and writes The ubik library was written before positional i/o was available and issues an lseek system call for each database file read and write. This change converts the ubik database accesses to use positional i/o on platforms where pread and pwrite are available, in order to reduce system call load. The new inline uphys_pread and uphys_pwrite functions are supplied on platforms which do not supply pread and pwrite. These functions fall back to non-positional i/o. If these symbols are present in the database server binary then the server process will continue to call lseek before each read and write access of the database file. This change does not affect the whole-file database synchronization done by ubik during database recovery (via the DISK_SendFile and DISK_GetFile RPCs), which still uses non-positional i/o. However, that code does not share file descriptors with the phys.c code, so there is no possibility of mixing positional and non-positional i/o on the same FDs. Reviewed-on: https://gerrit.openafs.org/12272 Reviewed-by: Andrew Deason Reviewed-by: Marcio Brito Barbosa Reviewed-by: Michael Meffie Reviewed-by: Mark Vitale Reviewed-by: Benjamin Kaduk Tested-by: BuildBot (cherry picked from commit f62fb17b3cf1c886f8cfc2fabe9984070dd3eec4) Change-Id: Iccc8f749c89659f4acebd74a1115425f69610ba8 Reviewed-on: https://gerrit.openafs.org/14142 Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Andrew Deason Reviewed-by: Marcio Brito Barbosa Reviewed-by: Stephan Wiesand commit fffd6e07c87e36cd9a4a36858c3df0c282622195 Author: Michael Meffie Date: Wed Apr 20 18:17:16 2016 -0400 ubik: remove unnecessary lseeks in uphys_open The ubik database file access layer has a file descriptor cache to avoid reopening the database file on each file access.
However, the file offset is reset with lseek on each and every use of the cached file descriptor, and the file offset is set twice when reading or writing data records. This change removes unnecessary and duplicate lseek system calls to reduce the system call load. Reviewed-on: https://gerrit.openafs.org/12271 Reviewed-by: Andrew Deason Reviewed-by: Mark Vitale Reviewed-by: Marcio Brito Barbosa Reviewed-by: Michael Meffie Reviewed-by: Benjamin Kaduk Tested-by: BuildBot (cherry picked from commit 6892bfbd701899281b34ee337637d438c7d8f8c6) Change-Id: I5ea7857796d94eb5b659d868e79b9fea2411f301 Reviewed-on: https://gerrit.openafs.org/14141 Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Andrew Deason Reviewed-by: Marcio Brito Barbosa Reviewed-by: Stephan Wiesand commit 438bc2c4caf96485e12eda9e7591fc4bd887ebad Author: Andrew Deason Date: Thu Nov 9 12:47:57 2017 -0600 asetkey: Add new 'delete' command variants The current 'delete' command from asetkey only lets the user delete old-style rxkad keys. Add a couple of new variants to allow specifying the key type and subtype, so the user can delete specific key types and enctypes if they want. Reviewed-on: https://gerrit.openafs.org/12767 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot (cherry picked from commit 5120409cc998284f2fb0467c2f88030976140341) Change-Id: I8c762839b50f4faf5e583fb5c510bf2ff9dd2259 Reviewed-on: https://gerrit.openafs.org/14293 Tested-by: BuildBot Reviewed-by: Cheyenne Wills Reviewed-by: Stephan Wiesand commit 1c395cd1bcf5770f70dce36ef29f9bd7ba98acaa Author: Kailas Zadbuke Date: Thu May 7 23:55:39 2020 -0400 salvaged: Fix "-parallel all" parsing In the salvageserver, the -parallel option accepts an "all" argument. However, the code does not parse the numeric part of that argument correctly. Due to this, only a single salvageserver instance was running even if we provided a larger number with the "all" argument. With this fix, the numeric part of the "all" argument is parsed correctly and the required number of salvageserver instances will be started. Reviewed-on: https://gerrit.openafs.org/14201 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk (cherry picked from commit 4512d04a9b721cd9052c0e8fe026c93faf6edb9e) Change-Id: I8910fb514986a404b22256e8a514955a684c8a27 Reviewed-on: https://gerrit.openafs.org/14285 Reviewed-by: Andrew Deason Reviewed-by: Cheyenne Wills Reviewed-by: Kailas Zadbuke Reviewed-by: Michael Meffie Tested-by: BuildBot Reviewed-by: Stephan Wiesand commit 38f7e123c33c1c11eb275e0619080ee874244afd Author: Cheyenne Wills Date: Tue Jun 16 15:20:20 2020 -0600 tests: Use usleep instead of nanosleep Commit "Build tests by default" 68f406436cc21853ff854c514353e7eb607cb6cb changes the build so tests are always built. On Solaris 10 the build fails because nanosleep is in librt, which we do not link against. Replace nanosleep with usleep. This avoids introducing extra configure tests just for Solaris 10. Note that with Solaris 11 nanosleep was moved from librt to libc, the standard C library.
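The substitution itself is mechanical; for a delay of, say, 10 milliseconds (the delay values actually used by the tests are not shown in the message), it amounts to:

    #include <unistd.h>

    static void
    pause_briefly(void)
    {
        /* Before: nanosleep(), which needs -lrt on Solaris 10:
         *     struct timespec ts = { 0, 10 * 1000 * 1000 };
         *     nanosleep(&ts, NULL);
         * After: usleep() takes microseconds and comes from plain libc. */
        usleep(10 * 1000);    /* 10 ms */
    }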
Reviewed-on: https://gerrit.openafs.org/14244 Reviewed-by: Andrew Deason Tested-by: BuildBot Reviewed-by: Benjamin Kaduk (cherry picked from commit 22a66e7b7e1d73437a8c26c2a1b45bc4ef214e77) Change-Id: Ic24c5a149451955b5c130e7b36cec27e02688d83 Reviewed-on: https://gerrit.openafs.org/14291 Reviewed-by: Cheyenne Wills Tested-by: BuildBot Reviewed-by: Stephan Wiesand commit 264d6b786c2fd0aff0e4abe835e42868f936afce Author: Andrew Deason Date: Tue Mar 24 11:59:48 2020 -0500 tests: Wait for server start in auth/superuser-t The auth/superuser-t test runs an Rx server and client in two child processes. If the client process tries to contact the server before the server has started listening on its port, some tests involving RPCs can fail (notably test 39, "Can run a simple RPC"). Normally if we try to contact a server that's not there, Rx will try resending its packets a few times, but on Linux with AFS_RXERRQ_ENV, if the port isn't open at all, we can get an ICMP_PORT_UNREACH error, which causes the relevant Rx call to die immediately with RX_CALL_DEAD. This means that if the auth/superuser-t client is only just a bit faster than the server starting up, tests can fail, since the server's port is not open yet. To avoid this, we can wait until the server's port is open before starting the client process. To do this, have the server process send a SIGUSR1 to the parent after rx_Init() is called, and have the parent process wait for the SIGUSR1 (waiting for a max of 5 seconds before failing). This should guarantee that the server's port will be open by the time the client starts running. Note that before commit 086d1858 (LINUX: Include linux/time.h for linux/errqueue.h), AFS_RXERRQ_ENV was mistakenly disabled on Linux 3.17+, so this issue was probably not possible on recent Linux before that commit. Reviewed-on: https://gerrit.openafs.org/14109 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk Reviewed-by: Cheyenne Wills (cherry picked from commit 66d0f91791695ac585f0511d0dadafd4e570b1bf) Change-Id: Ia6c06ca9a05e33b3bc35238d9c0d18e7ff339438 Reviewed-on: https://gerrit.openafs.org/14290 Reviewed-by: Cheyenne Wills Tested-by: BuildBot Reviewed-by: Stephan Wiesand commit eaa552d27074cd7e1c862d606916b063e6d89a27 Author: Andrew Deason Date: Tue Dec 31 12:04:48 2019 -0600 tests: Introduce afstest_GetProgname Currently, in tests/volser/vos-t.c we call afs_com_err as "authname-t", which is clearly a mistake during some code refactoring (introduced in commit 2ce3fdc5, "tests: Abstract out code to produce a Ubik client"). We could just change this to "vos-t", but instead of specifying constant strings everywhere, change this to figure out what the current command is called, and just use that. Put this code into a new function, afstest_GetProgname, and convert existing tests to use that instead of hard-coding the program name given to afs_com_err. Reviewed-on: https://gerrit.openafs.org/13991 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk (cherry picked from commit a21a2f8edb79d6190976e920a9a90d0878411146) Change-Id: I3d410d6de132f8a0fffeb9cce32a912fe3bbdc20 Reviewed-on: https://gerrit.openafs.org/14289 Reviewed-by: Cheyenne Wills Tested-by: BuildBot Reviewed-by: Stephan Wiesand commit fe57174d84cab19a13eea9a8f31f0ac71c8cf838 Author: Cheyenne Wills Date: Wed Aug 5 09:23:36 2020 -0600 butc: fix int to float conversion warning Building with clang-10 results in 2 warnings/errors associated with trying to convert 0x7fffffff to a floating point value.
tcmain.c:240:18: error: implicit conversion from 'int' to 'float' changes value from 2147483647 to 2147483648 [-Werror, -Wimplicit-int-float-conversion] if ((total > 0x7fffffff) || (total < 0)) /* Don't go over 2G */ and the same conversion warning on the statement on the following line: total = 0x7fffffff; Use floating point and decimal constants instead of the hex constants. For the test, use 2147483648.0 which is cleanly represented by a float. Change the comparison in the test from '>' to '>='. If the total value exceeds 2G, just assign the max value directly to the return variable. Reviewed-on: https://gerrit.openafs.org/14277 Reviewed-by: Andrew Deason Tested-by: BuildBot Reviewed-by: Benjamin Kaduk (cherry picked from commit 37b55b30c65d0ab8c8eaabfda0dbd90829e2c46a) Change-Id: I16e0acd893049b01a2c5e4c7e71de3fa40e28d3e Reviewed-on: https://gerrit.openafs.org/14299 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Stephan Wiesand commit e7902252f15acfc28453c531f6fa3b29c9c91b92 Author: Cheyenne Wills Date: Fri Aug 21 10:37:51 2020 -0600 LINUX 5.9: Remove HAVE_UNLOCKED_IOCTL/COMPAT_IOCTL Linux-5.9-rc1 commit 'fs: remove the HAVE_UNLOCKED_IOCTL and HAVE_COMPAT_IOCTL defines' (4e24566a) removed the two referenced macros from the kernel. The support for unlocked_ioctl and compat_ioctl were introduced in Linux 2.6.11. Remove references to HAVE_UNLOCKED_IOCTL and HAVE_COMPAT_IOCTL using the assumption that they were always defined. Notes: With this change, building against kernels 2.6.10 and older will fail. RHEL4 (EOL in March 2017) used a 2.6.9 kernel. RHEL5 uses a 2.6.18 kernel. In linux-2.6.33-rc1 the commit messages for "staging: comedi: Remove check for HAVE_UNLOCKED_IOCTL" (00a1855c) and "Staging: comedi: remove check for HAVE_COMPAT_IOCTL" (5d7ae225) both state that all new kernels have support for unlocked_ioctl/compat_ioctl so the checks can be removed along with removing support for older kernels. Reviewed-on: https://gerrit.openafs.org/14300 Reviewed-by: Benjamin Kaduk Tested-by: Benjamin Kaduk (cherry picked from commit 13a49aaf0d5c43bce08135edaabb65587e1a8031) Change-Id: I6dc5ae5b32031641f4a021a31630390a91d834fe Reviewed-on: https://gerrit.openafs.org/14315 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Stephan Wiesand commit 99232fa48020bde90f6b245da4374bc63399654f Author: Cheyenne Wills Date: Wed Aug 5 09:19:02 2020 -0600 afs: Set AFS_VFSFSID to a numerical value Currently when UKERNEL is defined, AFS_VFSFSID is always set to AFS_MOUNT_AFS, which is a string for many platforms for UKERNEL. Update src/afs/afs.h to insure that the define for AFS_VFSFSID is a numeric value when building UKERNEL. Clean up the preprocessor indentation in src/afs/afs.h in the area around the AFS_VFSFSID defines. Thanks to adeason@sinenomine.net for pointing out a much easier solution for resolving this problem. 
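For the HAVE_UNLOCKED_IOCTL/HAVE_COMPAT_IOCTL change above, the effect on code that fills in a Linux file_operations structure looks roughly like this; the handler names are illustrative, not the actual OpenAFS symbols:

    /* Before, the assignments were wrapped in the (now-removed) guards:
     *
     *     #ifdef HAVE_UNLOCKED_IOCTL
     *         .unlocked_ioctl = afs_unlocked_ioctl,
     *     #else
     *         .ioctl = afs_ioctl,
     *     #endif
     *
     * With the guards gone, the fields are simply set unconditionally,
     * since unlocked_ioctl/compat_ioctl exist on every kernel (2.6.11+)
     * that the tree still supports building against: */
    static struct file_operations afs_ioctl_fops = {
        .unlocked_ioctl = afs_unlocked_ioctl,
        .compat_ioctl   = afs_compat_ioctl,
    };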
Reviewed-on: https://gerrit.openafs.org/14279 Reviewed-by: Andrew Deason Tested-by: BuildBot Reviewed-by: Benjamin Kaduk (cherry picked from commit 446457a1240b88fd94fc34ff5715f2b7f2f3ef12) Change-Id: Ic0f9c2f1f4baeb9f99da19e1187f1bc9f5d7f824 Reviewed-on: https://gerrit.openafs.org/14297 Reviewed-by: Andrew Deason Tested-by: BuildBot Reviewed-by: Stephan Wiesand commit 7e2a62fb594a7713a5c694b9e0401b5e3c2648c8 Author: Cheyenne Wills Date: Wed Aug 5 09:18:49 2020 -0600 afs: Avoid using logical OR when setting f_fsid Building with clang-10 produces the warning/error message warning: converting the result of '<<' to a boolean always evaluates to true [-Wtautological-constant-compare] for the expression abp->f_fsid = (AFS_VFSMAGIC << 16) || AFS_VFSFSID; The message is because a logical OR '||' is used instead of a bitwise OR '|'. The result of this expression will always set the f_fsid member to a 1 and not the intended value of AFS_VFSMAGIC combined with AFS_VFSFSID. Update the expression to use a bitwise OR instead of the logical OR. Note: This will change value stored in the f_fsid that is returned from statfs. Using a logical OR has existed since OpenAFS 1.0 for hpux/solaris and in UKERNEL since OpenAFS 1.5 with the commit 'UKERNEL: add uafs_statvfs' b822971a. Reviewed-on: https://gerrit.openafs.org/14292 Reviewed-by: Andrew Deason Tested-by: BuildBot Reviewed-by: Benjamin Kaduk (cherry picked from commit c56873bf95f6325b70e63ed56ce59a3c6b2b753b) Change-Id: I2e3fe6a84ef6ce73fff933f137d4806efefa5949 Reviewed-on: https://gerrit.openafs.org/14298 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Stephan Wiesand commit fc32922236122b9013ed9bdedda8c54c2e8c75d8 Author: Andrew Deason Date: Fri May 1 15:02:08 2020 -0500 vlserver: Return error when growing beyond 2 GiB In the vlserver, when we add a new vlentry or extent block, we grow the VLDB by doing something like this: vital_header.eofPtr += sizeof(item); Since we don't check for overflow, and all of our offset-related variables are signed 32-bit integers, this can cause some odd behavior if we try to grow the database to be over 2 GiB in size. To avoid this, change the two places in vlserver code that grow the database to use a new function, grow_eofPtr(), which checks for 31-bit overflow. If we are about to overflow, log a message and return an error. See the following for a specific example of our "odd behavior" when we overflow the 2 GiB limit in the VLDB: With 1 extent block, we can create 14509076 vlentries successfully. On the 14509077th vlentry, we'll attempt to write the entry to offset 2147483560 (0x7FFFFFA8). Since a vlentry is 148 bytes long, we'll write all the way through offset 2147483707 (0x8000003B), which is over the 31-bit limit. In the udisk subsystem, this results in writing to page numbers 2097151, and -2097152 (since our ubik pages are 1k, and going over the 31-bit limit causes us to treat offsets as negative). These pages start at physical offsets 2147482688 (0x7FFFFC40) and -2147483584 (-0x7FFFFFC0) in our vldb.DB0 (where offset is page*1024+64). Modifying each of these pages involves reading in the existing page first, modifying the parts we are changing, and writing it back. This works just fine for 2097151, but of course fails for -2097152. 
The latter fails in DReadBuffer when eventually our pread() fails with EINVAL, and causes ubik to log the message: Ubik: Error reading database file: errno=22 But when DReadBuffer fails, DReadBufferForWrite assumes this is due to EOF, and just creates a new buffer for the given page (DNewBuffer). So, the udisk_write() call ultimately succeeds. When we go to flush the dirty data to disk when committing the transaction, after we have successfully written the transaction log, DFlush() fails for the -2097152 page when the pwrite() call eventually fails with EINVAL, causing ubik to panic, logging the messages: Ubik PANIC: Writing Ubik DB modifications When the vlserver gets restarted by bosserver, we then process the transaction log, and perform the operations in the log before starting up (ReplayLog). The log records the actual data we wrote, not split into pages, and the log-replaying code writes directly to the db using uphys_write instead of udisk_write. So, because of this, the write actually succeeds when replaying the log, since we just write 148 bytes to offset 2147483624 (0x7FFFFFE8), and no negative offsets are used. The vlserver will then be able to run, but will be unable to read that newly-created vlentry, since it involves reading a ubik page beyond the 31-bit boundary. That means trying to look up that entry will fail with i/o errors, as will lookups of any entry on the same hash chain as the new entry (since the new entry will be added to the head of the hash chain). Listing all entries in the database will also just show an empty database, since our vital_header.eofPtr will be negative, and we determine EOF by comparing our current blockindex to the value in eofPtr. Reviewed-on: https://gerrit.openafs.org/14180 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk Reviewed-by: Cheyenne Wills (cherry picked from commit d01398731550b8a93b293800642c3e1592099114) Change-Id: I72302a8c472b3270e99e58a573f5cf25dd34b9c5 Reviewed-on: https://gerrit.openafs.org/14288 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Cheyenne Wills Reviewed-by: Stephan Wiesand commit 03979eaa174de0cff188af451806fb7ae10d3a62 Author: Andrew Deason Date: Tue Apr 7 13:15:31 2020 -0500 vlserver: Correctly pad nvlentry for "O" RPCs For our old-style "O" RPCs (e.g. VL_CreateEntry, instead of VL_CreateEntryN), vlserver calls vldbentry_to_vlentry to convert to the internal 'struct nvlentry' format. After all of the sites have been copied to the internal format, we fill the remaining sites by setting the serverNumber to BADSERVERID. For nvldbentry_to_vlentry, we do this for NMAXNSERVERS sites, but for vldbentry_to_vlentry, we do this for OMAXNSERVERS. The thing is, both functions are filling in entries for a 'struct nvlentry', which has NMAXNSERVERS 'serverNumber' entries. So for vldbentry_to_vlentry, we are skipping setting the last few sites (specifically, NMAXNSERVERS-OMAXNSERVERS = 13-8 = 5). This can easily cause our O-style RPCs to write out entries to disk that have uninitialized sites at the end of the array. For example, an entry with one site should have server numbers that look like this: serverNumber = {1, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255} That is, one real serverid (a '1' here), followed by twelve BADSERVERIDs.
But for a VL_CreateEntry call, the 'struct nvlentry' is zeroed out before vldbentry_to_vlentry is called, and so the server numbers in the written entry look like this: serverNumber = {1, 255, 255, 255, 255, 255, 255, 255, 0, 0, 0, 0, 0} That is, one real serverid (a '1' here), followed by seven BADSERVERIDs, followed by five '0's. Most of the time, this is not noticeable, since our code that reads in entries from disk stops processing sites when we encounter the first BADSERVERID site (see vlentry_to_nvldbentry). However, if the entry has 8 sites, then none of the entries will contain BADSERVERID, and so we will actually process the trailing 5 bogus sites. This would appear as 5 extra volume sites for a volume, most likely all for the same server. For VL_CreateEntry, the vlentry struct is always zeroed before we use it, so the trailing sites will always be filled with 0. For VL_ReplaceEntry, the trailing sites will be unchanged from whatever was read in from the existing disk entry. To fix this, just change the relevant loop to go through NMAXNSERVERS entries, so we actually go to the end of the serverNumber (et al) array. This may appear similar to commit ddf7d2a7 (vlserver: initialize nvlentry elements after read). However, that commit fixed a case involving the old vldb database format (which hopefully is not being used). This commit fixes a case where we are using the new vldb database format, but with the old RPCs, which may still be used by old tools. Reviewed-on: https://gerrit.openafs.org/14139 Tested-by: Andrew Deason Reviewed-by: Benjamin Kaduk (cherry picked from commit 7e41ee0bd50d39a356f0435ff370a0a7be40306f) Change-Id: I68ecc41e7268efd0424e6c129aa914cd04115b59 Reviewed-on: https://gerrit.openafs.org/14287 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Cheyenne Wills Reviewed-by: Stephan Wiesand commit 986ee1fd422d1608553af5a6a02bd60349b864ac Author: Cheyenne Wills Date: Fri Jun 19 08:01:14 2020 -0600 afs: Avoid panics on failed return from afs_CFileOpen afs_CFileOpen is a macro that invokes the open "method" of the afs_cacheOps structure, and for disk caches the osi_UFSOpen function is used. Currently osi_UFSOpen will panic if there is an error encountered while opening a file. Prepare to handle osi_UFSOpen function returning a NULL instead of issuing a panic (future commit). Update callers of afs_CFileOpen to test for an error and to return an error instead of issuing a panic. While this commit eliminates some panics, it does not address some of the more complex cases associated with errors from afs_CFileOpen. Reviewed-on: https://gerrit.openafs.org/14241 Reviewed-by: Andrew Deason Reviewed-by: Mark Vitale Tested-by: BuildBot Reviewed-by: Benjamin Kaduk (cherry picked from commit d2d27f975df13c3833898611dacff940a5ba3e2a) Change-Id: Ia30711748b3cffd56eb3120961aa1747dfae0f23 Reviewed-on: https://gerrit.openafs.org/14286 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Cheyenne Wills Reviewed-by: Stephan Wiesand commit a00fe9fd4253b2436ae649b05b329d0d52c8b6f6 Author: Yadavendra Yadav Date: Thu Mar 5 07:21:55 2020 +0000 LINUX: Initialize CellLRU during osi_Init When OpenAFS kernel module gets loaded, it will create certain entries in "proc" filesystem. One of those entries is "CellServDB", in case we read "/proc/fs/openafs/CellServDB" without starting "afsd" it will result in crash with NULL pointer deref. 
The reason for the crash is that CellLRU has not been initialized yet (since "afsd" has not been started), i.e. afs_CellInit has not yet been called, so its "next" and "prev" pointers will be NULL. Inside "c_start()" we do not check for NULL pointers while traversing CellLRU, and this causes the crash. To avoid this, initialize CellLRU during module initialization. Reviewed-on: https://gerrit.openafs.org/14093 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk (cherry picked from commit 8d90a9d27b0ef28ddcdd3eb041c8a9d019b84b50) Change-Id: I9ed97d3751943331c9d9bc9dfc73acc24231187b Reviewed-on: https://gerrit.openafs.org/14284 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Cheyenne Wills Reviewed-by: Stephan Wiesand commit 6824d45c2ab83e52350c3f366e4cb6f1eb263088 Author: Kailas Zadbuke Date: Wed Jun 3 15:44:08 2020 +0530 util: Handle serverLogMutex lock across forks If a process forks when another thread has serverLogMutex locked, the child process inherits the locked serverLogMutex. This causes a deadlock when code in the child process tries to lock serverLogMutex, since we can never unlock serverLogMutex because the locking thread no longer exists. This can happen in the salvageserver, since the salvageserver locks serverLogMutex in different threads, and forks to handle salvage jobs. To avoid this deadlock, we register handlers using pthread_atfork() so that the serverLogMutex will be held during the fork. The fork will be blocked until the worker thread releases the serverLogMutex. Hence the serverLogMutex will be held until the fork is complete and it will be released in the parent and child threads. Thanks to Yadavendra Yadav(yadayada@in.ibm.com) for working with me on this issue. Reviewed-on: https://gerrit.openafs.org/14239 Reviewed-by: Benjamin Kaduk Tested-by: BuildBot (cherry picked from commit e44d6441c8786fdaaa1fad1b1ae77704c12f7d60) Change-Id: I09c04c0bd99b10433857ccdaeb4ee6a4cd50f768 Reviewed-on: https://gerrit.openafs.org/14283 Tested-by: BuildBot Reviewed-by: Kailas Zadbuke Reviewed-by: Andrew Deason Reviewed-by: Cheyenne Wills Reviewed-by: Stephan Wiesand commit 179a418ea5063785a23e4faf35134f063a6f3e1c Author: Andrew Deason Date: Fri Mar 13 13:00:35 2020 -0500 LINUX: Properly revert creds in osi_UFSTruncate Commit cd3221d3 (Linux: use override_creds when available) caused us to force the current process's creds to the creds of afsd during osi_file.c file ops, to avoid access errors in some cases. However, in osi_UFSTruncate, one code path that should revert our creds back to the original user's creds was missed: when the afs_osi_Stat call fails or deems the truncate unnecessary. In this case, the calling process keeps the creds for afsd after osi_UFSTruncate returns, causing our subsequent access-checking code to think that the current process is in the same context as afsd (typically uid 0 without a pag). This can cause the calling process to appear to transiently have the same access as non-pag uid 0; typically this will be unauthenticated access, but could be authenticated if uid 0 has tokens. To fix this, modify the early return in osi_UFSTruncate to go through a 'goto done' destructor instead, and make sure we revert our creds in that destructor. Thanks to cwills@sinenomine.net for finding and helping reproduce the issue.
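A condensed sketch of the osi_UFSTruncate() shape after that change; details, other platforms, and some variable names are illustrative, and the point is only that every exit path, including the early "nothing to do" one, now passes through the cred-reverting destructor:

    int
    osi_UFSTruncate(struct osi_file *afile, afs_int32 asize)
    {
        int code;
        struct osi_stat tstat;
        const struct cred *oldcred;

        /* run the file op with afsd's privileged creds (see cd3221d3);
         * "cache_creds" here stands in for whatever cred afsd registered */
        oldcred = override_creds(cache_creds);

        code = afs_osi_Stat(afile, &tstat);
        if (code || tstat.size <= asize) {
            /* formerly a bare "return code;", which leaked afsd's creds
             * into the calling process */
            goto done;
        }

        /* ...perform the actual truncate on the cache file... */

     done:
        revert_creds(oldcred);    /* always restore the caller's creds */
        return code;
    }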
Reviewed-on: https://gerrit.openafs.org/14098 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk Reviewed-by: Jeffrey Hutzelman Reviewed-by: Cheyenne Wills Tested-by: Cheyenne Wills (cherry picked from commit 57b4f4f9be1e25d5609301c10f717aff32aef676) Change-Id: I714eb2dea9645ffe555f26b5d69707a7afbe8d81 Reviewed-on: https://gerrit.openafs.org/14099 Reviewed-by: Andrew Deason Reviewed-by: Jeffrey Hutzelman Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Stephan Wiesand

commit ee578e92d9f810d93659a9805d0c12084fe2bb95 Author: Jeffrey Hutzelman Date: Thu May 2 16:02:47 2019 -0400 Linux: use override_creds when available

Linux may perform some access control checks at the time of an I/O operation, rather than relying solely on checks done when the file is opened. In some cases (e.g. AppArmor), these checks are done based on the current task's creds at the time of the I/O operation, not those used when the file was opened. Because of this, we must use override_creds() / revert_creds() to make sure we are using privileged credentials when performing I/O operations on cache files. Otherwise, cache I/O operations done in the context of a task with a restrictive AppArmor profile will fail. Reviewed-on: https://gerrit.openafs.org/13751 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk (cherry picked from commit cd3221d3532a28111ad22d4090ec913cbbff40da) Change-Id: I8955ff6150462fecba9a10a8f99bce9ee8163435 Reviewed-on: https://gerrit.openafs.org/14082 Reviewed-by: Cheyenne Wills Reviewed-by: Andrew Deason Tested-by: BuildBot Reviewed-by: Michael Meffie Reviewed-by: Jeffrey Hutzelman Reviewed-by: Stephan Wiesand

commit facff58b840a47853592510617ba7a1da2e3eaa9 Author: Cheyenne Wills Date: Fri Jul 3 10:35:06 2020 -0600 LINUX 5.8: use lru_cache_add

With Linux-5.8-rc1 commit 'mm: fold and remove lru_cache_add_anon() and lru_cache_add_file()' (6058eaec), the lru_cache_add_file function is removed since it was functionally equivalent to lru_cache_add. Replace lru_cache_add_file with lru_cache_add.
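A small configure-time shim is enough to keep a call site working across kernel versions. A sketch, under the assumption that the autoconf test defines a macro such as HAVE_LINUX_LRU_CACHE_ADD (the macro and wrapper names here are illustrative, not necessarily the ones used in the tree):

    /* Sketch of a compatibility wrapper for the renamed LRU helper; the
     * HAVE_LINUX_LRU_CACHE_ADD macro name is illustrative and may not
     * match the actual autoconf symbol. */
    #include <linux/swap.h>

    static inline void
    afs_lru_add_page(struct page *page)
    {
    #if defined(HAVE_LINUX_LRU_CACHE_ADD)
        lru_cache_add(page);        /* Linux 5.8 and later */
    #else
        lru_cache_add_file(page);   /* Linux 2.6.28 through 5.7 */
    #endif
    }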
Introduce a new autoconf test to determine if lru_cache_add is present.

For reference, the Linux changes associated with the lru caches:
  - __pagevec_lru_add introduced before v2.6.12-rc2
  - lru_cache_add_file introduced in v2.6.28-rc1
  - __pagevec_lru_add_file replaces __pagevec_lru_add in v2.6.28-rc1
      vmscan: split LRU lists into anon & file sets (4f98a2fee)
  - __pagevec_lru_add removed in v5.7 with a note to use lru_cache_add_file
      mm/swap.c: not necessary to export __pagevec_lru_add() (bde07cfc6)
  - lru_cache_add_file removed in v5.8
      mm: fold and remove lru_cache_add_anon() and lru_cache_add_file() (6058eaec)
  - lru_cache_add exported
      mm: fold and remove lru_cache_add_anon() and lru_cache_add_file() (6058eaec)

OpenAFS will use:
  - lru_cache_add on 5.8 kernels
  - lru_cache_add_file from 2.6.28 through 5.7 kernels
  - __pagevec_lru_add/__pagevec_lru_add_file on pre-2.6.28 kernels

Reviewed-on: https://gerrit.openafs.org/14249 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Yadavendra Yadav Reviewed-by: Benjamin Kaduk (cherry picked from commit 7d85ce221d6ccc19cf76ce7680c74311e4ed2632) Change-Id: Iba6ef4441687dbf60d227a708e2a032c2c0dc79f Reviewed-on: https://gerrit.openafs.org/14269 Tested-by: BuildBot Reviewed-by: Michael Laß Reviewed-by: Stephan Wiesand

commit 335f37be13d2ff954e4aeea617ee66502170805e Author: Cheyenne Wills Date: Fri Jul 3 10:34:42 2020 -0600 LINUX 5.8: do not set name field in backing_dev_info

Linux-5.8-rc1 commit 'bdi: remove the name field in struct backing_dev_info' (1cd925d5838) removes this field. Do not set the name field in the backing_dev_info structure if it is not available. This uses an existing config test, 'STRUCT_BACKING_DEV_INFO_HAS_NAME'. Note that the name field in the backing_dev_info structure was added in Linux-2.6.32. Reviewed-on: https://gerrit.openafs.org/14248 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk (cherry picked from commit d8ec294534fcdee77a2ccd297b4b167dc4d5573d) Change-Id: I3d9e18092db998a4c4f26bd63ee3b75383a53d4c Reviewed-on: https://gerrit.openafs.org/14268 Tested-by: BuildBot Reviewed-by: Michael Laß Reviewed-by: Stephan Wiesand

commit d7fc5bf9bf031089d80703c48daf30d5b15a80ca Author: Cheyenne Wills Date: Fri Jul 3 10:33:51 2020 -0600 LINUX 5.8: Replace kernel_setsockopt with new funcs

Linux 5.8-rc1 commit 'net: remove kernel_setsockopt' (5a892ff2facb) retires the kernel_setsockopt function. In prior kernel commits, new functions (ip_sock_set_*) were added to replace the specific operations performed via kernel_setsockopt. Define a new config test, 'HAVE_IP_SOCK_SET', if the 'ip_sock_set' functions are available. The config define 'HAVE_KERNEL_SETSOCKOPT' is no longer set on Linux 5.8. Create wrapper functions that replace the kernel_setsockopt calls with calls to the appropriate Linux kernel function(s) (depending on what functions the kernel supports). Remove the unused 'kernel_getsockopt' function (used for building with pre-2.6.19 kernels).
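As an example of what such a wrapper can look like, here is a sketch using IP_RECVERR as the option being set; the wrapper name and the choice of option are illustrative, not a claim about which options the real code sets this way.

    /* Sketch of a wrapper that hides the kernel_setsockopt removal; the
     * function name and the use of IP_RECVERR as the example option are
     * illustrative. */
    #include <linux/net.h>
    #include <linux/in.h>
    #include <net/ip.h>

    static void
    osi_socket_set_recverr(struct socket *sock)
    {
    #if defined(HAVE_IP_SOCK_SET)
        /* Linux 5.8+: the per-option helper replaces kernel_setsockopt(). */
        ip_sock_set_recverr(sock->sk);
    #else
        int on = 1;
        kernel_setsockopt(sock, SOL_IP, IP_RECVERR, (char *)&on, sizeof(on));
    #endif
    }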
For reference: Linux 2.6.19 introduced kernel_setsockopt. Linux 5.8 removed kernel_setsockopt and replaced the functionality with a set of new functions (ip_sock_set_*). Reviewed-on: https://gerrit.openafs.org/14247 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk (cherry picked from commit c48072b9800759ef1682b91ff1e962f6904a2594) Change-Id: I2724fad06b1882149d2066d13eced55eff5ee695 Reviewed-on: https://gerrit.openafs.org/14267 Tested-by: BuildBot Reviewed-by: Michael Laß Reviewed-by: Stephan Wiesand

commit 0f67e733e82a9001f3f9253c5e1880be845d537b Author: Cheyenne Wills Date: Thu Apr 2 13:27:50 2020 -0600 LINUX: Include linux/time.h for linux/errqueue.h

The configuration test for errqueue.h fails with an undefined structure error on a Linux 3.17 (or higher) system. This prevents setting HAVE_LINUX_ERRQUEUE_H, which is used to define AFS_RXERRQ_ENV. Linux commit f24b9be5957b38bb420b838115040dc2031b7d0c (net-timestamp: extend SCM_TIMESTAMPING ancillary data struct), which was picked up in Linux 3.17, added a structure that uses the timespec structure. After this commit, we need to include linux/time.h to pull in the definition of the timespec struct. Reviewed-on: https://gerrit.openafs.org/13950 Reviewed-by: Andrew Deason Tested-by: Andrew Deason Reviewed-by: Benjamin Kaduk (cherry picked from commit 086d185872da5f19447cf5ec7846e7ce5104563f) Change-Id: I67d01b11c5ea95b8b87832fc833f8fc850ade684 Reviewed-on: https://gerrit.openafs.org/14130 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk Reviewed-by: Andrew Deason Reviewed-by: Stephan Wiesand

commit 5a14bd0abe83b580f0cc7a200ae963399ab7de5f Author: Cheyenne Wills Date: Tue May 26 12:11:28 2020 -0600 vol: Fix format-truncation warning with gcc-10.1

Building with gcc-10.1 produces a warning (an error if --enable-checking) in vol-salvage.c:

  error: ‘%s’ directive output may be truncated writing up to 755 bytes into a region of size 255 [-Werror=format-truncation=]
    809 | snprintf(inodeListPath, 255, "%s" OS_DIRSEP "salvage.inodes.%s.%d", tdir, name,

Use strdup/asprintf to allocate the buffer dynamically instead of using a buffer with a hardcoded size. Reviewed-on: https://gerrit.openafs.org/14207 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk (cherry picked from commit d73680c5f70ee5aeb634a9ec88bf1097743d0f76) Change-Id: I8d3bf244a70f723f585738905deea7ddfb1bb862 Reviewed-on: https://gerrit.openafs.org/14232 Reviewed-by: Mark Vitale Tested-by: Cheyenne Wills Tested-by: BuildBot Reviewed-by: Stephan Wiesand

commit d5fc5283e91ea94a67df8364a5b8bf8970ffe934 Author: Michael Meffie Date: Mon Oct 9 22:16:09 2017 -0400 afsmonitor: remove unused LWP_WaitProcess

Remove the unimplemented once-only flag and the unused LWP_WaitProcess call. Reviewed-on: https://gerrit.openafs.org/12745 Tested-by: BuildBot Reviewed-by: Benjamin Kaduk (cherry picked from commit 7c27365ea24aed5787f6fc03f30f6085c78ece51) Change-Id: I3b61f9fb4f45564304b0e35878d3535a10e31d02 Reviewed-on: https://gerrit.openafs.org/14226 Reviewed-by: Andrew Deason Reviewed-by: Michael Meffie Reviewed-by: Mark Vitale Tested-by: BuildBot Reviewed-by: Stephan Wiesand

commit a2eec64374d6b754b29c509b554573cb6e53eb46 Author: Cheyenne Wills Date: Fri May 22 12:16:48 2020 -0600 Avoid duplicate definitions of globals

GCC 10 changed a default flag from -fcommon to -fno-common. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85678 for some background. The change in gcc 10 results in link-time errors when building.
For example:

  ../../src/xstat/.libs/liboafs_xstat_cm.a(xstat_cm.o):(.bss+0x2050): multiple definition of `numCollections';

Ensure that only one definition exists for each global data object, and change references to use "extern" as needed. To ensure that future changes do not introduce duplicated global definitions, add the -fno-common flag to XCFLAGS when using the configure --enable-checking setting. [cwills@sinenomine.net: Note for 1.8.x: renamed terminationEvent to cm_terminationEvent/fs_terminationEvent instead of deleting it.] Reviewed-on: https://gerrit.openafs.org/14106 Tested-by: BuildBot Reviewed-by: Andrew Deason Reviewed-by: Benjamin Kaduk (cherry picked from commit 0e2072ae386d4111bef161eb955964b649c31386) Change-Id: I54ca61d372cf763e4a28c0b0829ea361219f6203 Reviewed-on: https://gerrit.openafs.org/14217 Reviewed-by: Andrew Deason Reviewed-by: Mark Vitale Tested-by: BuildBot Reviewed-by: Stephan Wiesand
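For the 'Avoid duplicate definitions of globals' change above, the pattern behind the fixes is small; here is a sketch using numCollections from the linker error as the example (the file names are illustrative):

    /* Before: a tentative definition in a shared header gives every object
     * file that includes it its own copy, which -fno-common rejects at
     * link time. */
    /* xstat_cm.h (illustrative) */
    int numCollections;

    /* After: the header only declares the global... */
    /* xstat_cm.h (illustrative) */
    extern int numCollections;

    /* ...and exactly one source file provides the definition. */
    /* xstat_cm.c (illustrative) */
    int numCollections = 0;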