Sysadmin


A community dedicated to the profession of IT Systems Administration

1
 
 

I ~~setup~~ (took over and spruced up, to be precise) this community specifically because of the time I've spent over the years browsing and relying on reddit.com/r/sysadmin for tips/tricks, security exploits & patches, outages, and yes, even the ranting about how our jobs all suck. (I like mine, for what it's worth.)

Come on down, ask questions, post what the sysadmin community needs to know about, or head in to get either sympathy or chastisement about why you haven't left your job yet. 🤣

Want to be a mod? Let me know!

2
 
 

Programs with custom services, virtual environments, config files in different locations, programs writing data to yet other locations...

I know today a lot of stuff runs in docker, but how does a sysadmin remember what they've done on their system? Is it all about documenting and keeping your docs updated? Is there any other way?

(Eg. For installing calibre-web I had to create a python venv; the venv is owned by root in /opt, but the service starting calibre-web in /etc/systemd/system needs to be executed with the User=<user> specifier because calibre-web wants to write to a user home directory. At the same time the database folder needs to be owned by www-data because I want to r/w it from nextcloud... So calibre-web is installed as a custom root(?) program, running in a virtual env, can access a folder owned by someone else, but still needs to be executed by yet another user to store its data there...)
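
For the record, that setup boils down to something like the following shell steps -- a sketch only, with hypothetical paths/usernames and an abbreviated unit file:

# venv under /opt, owned by root
python3 -m venv /opt/calibre-web
/opt/calibre-web/bin/pip install calibreweb

# database folder shared with nextcloud (path hypothetical)
chown -R www-data:www-data /srv/calibre-web-data

# /etc/systemd/system/cps.service -- runs as a regular user
cat > /etc/systemd/system/cps.service <<'EOF'
[Unit]
Description=Calibre-Web

[Service]
User=myuser
ExecStart=/opt/calibre-web/bin/cps

[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload && systemctl enable --now cps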

Setting aside my current confusion about whether all of this is right in terms of security, syntax and ownership: no fucking way will I remember all this stuff a week from now. So... what do you do, if you do anything? Do you use flowcharts? Simple text documents? Both?

Essentially, how do you keep track?

3
 
 

This article will describe how to download an image from a (docker) container registry.

Manual Download of Container Images with wget and curl

Intro

Remember the good ol' days when you could just download software by visiting a website and clicking "download"?

Even apt and yum repositories were just simple HTTP servers that you could curl (or wget) from. Using the package manager was, of course, more secure and convenient -- but you could always just download packages manually, if you wanted.

But have you ever tried to curl an image from a container registry, such as docker? Well friends, I have tried. And I have the scars to prove it.

It was a remarkably complex process that took me weeks to figure out. Lucky for you, this article will break it down.

Examples

Specifically, we'll look at how to download files from two OCI registries.

  1. Docker Hub
  2. GitHub Packages

Terms

First, here's some terminology used by OCI:

  1. OCI - Open Container Initiative
  2. blob - A "blob" in the OCI spec just means a file
  3. manifest - A "manifest" in the OCI spec means a list of files

Prerequisites

This guide was written in 2024, and it uses the following software and versions:

  1. debian 12 (bookworm)
  2. curl 7.88.1
  3. OCI Distribution Spec v1.1.0 (which, unintuitively, uses the '/v2/' endpoint)

Of course, you'll need 'curl' installed. And, to parse json, 'jq' too.

sudo apt-get install curl jq

What is OCI?

OCI stands for Open Container Initiative.

OCI was originally formed in June 2015 by Docker and CoreOS. Today it's a wider, general-purpose (and annoyingly complex) way for many projects to host files (that are extremely non-trivial to download).

One does not simply download a file from an OCI-compliant container registry. You must:

  1. Generate an authentication token for the API
  2. Make an API call to the registry, requesting to download a JSON "Manifest"
  3. Parse the JSON Manifest to figure out the hash of the file that you want
  4. Determine the download URL from the hash
  5. Download the file (which might actually be many distinct file "layers")

In order to figure out how to make an API call to the registry, you must first read (and understand) the OCI specs here.

OCI APIs

OCI maintains three distinct specifications:

  1. image spec
  2. runtime spec
  3. distribution spec

OCI "Distribution Spec" API

To figure out how to download a file from a container registry, we're interested in the "distribution spec". At the time of writing, the latest version can be downloaded here:

The above PDF file defines a set of API endpoints that we can use to query, parse, and then figure out how to download a file from a container registry. The table from the above PDF is copied below:

ID       Method       API Endpoint                                                  Success   Failure
end-1    GET          /v2/                                                          200       404/401
end-2    GET / HEAD   /v2/<name>/blobs/<digest>                                     200       404
end-3    GET / HEAD   /v2/<name>/manifests/<reference>                              200       404
end-4a   POST         /v2/<name>/blobs/uploads/                                     202       404
end-4b   POST         /v2/<name>/blobs/uploads/?digest=<digest>                     201/202   404/400
end-5    PATCH        /v2/<name>/blobs/uploads/<reference>                          202       404/416
end-6    PUT          /v2/<name>/blobs/uploads/<reference>?digest=<digest>          201       404/400
end-7    PUT          /v2/<name>/manifests/<reference>                              201       404
end-8a   GET          /v2/<name>/tags/list                                          200       404
end-8b   GET          /v2/<name>/tags/list?n=<integer>&last=<integer>               200       404
end-9    DELETE       /v2/<name>/manifests/<reference>                              202       404/400/405
end-10   DELETE       /v2/<name>/blobs/<digest>                                     202       404/405
end-11   POST         /v2/<name>/blobs/uploads/?mount=<digest>&from=<other_name>    201       404
end-12a  GET          /v2/<name>/referrers/<digest>                                 200       404/400
end-12b  GET          /v2/<name>/referrers/<digest>?artifactType=<artifactType>     200       404/400
end-13   GET          /v2/<name>/blobs/uploads/<reference>                          204       404

In OCI, files are (cryptically) called "blobs". In order to figure out the file that we want to download, we must first reference the list of files (called a "manifest").

The above table shows us how we can download a list of files (manifest) and then download the actual file (blob).
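
In curl terms, the two read endpoints we care about (end-3, then end-2) look like this, keeping the table's placeholders and a hypothetical registry host:

# end-3: fetch the manifest (the list of files) for a tag or digest
curl -s "https://registry.example.com/v2/<name>/manifests/<reference>"

# end-2: fetch a blob (an actual file) by digest; -L follows CDN redirects
curl -sL "https://registry.example.com/v2/<name>/blobs/<digest>"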

Examples

Let's look at how to download files from a couple different OCI registries:

  1. Docker Hub
  2. GitHub Packages

Docker Hub

To see the full example of downloading images from docker hub, click here
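
The link above didn't survive this copy, so here's a minimal sketch of the whole flow against Docker Hub with curl and jq; the image library/alpine and tag latest are just for illustration. Requesting only the older Docker v2 manifest media type should sidestep multi-arch manifest lists (Docker Hub falls back to the linux/amd64 image for such clients); a real client would handle manifest lists too:

# 1. get an anonymous pull token
TOKEN=$(curl -s "https://auth.docker.io/token?service=registry.docker.io&scope=repository:library/alpine:pull" | jq -r '.token')

# 2. download the manifest (the list of "blobs")
curl -s -H "Authorization: Bearer $TOKEN" \
     -H "Accept: application/vnd.docker.distribution.manifest.v2+json" \
     "https://registry-1.docker.io/v2/library/alpine/manifests/latest" > manifest.json

# 3. parse the manifest to get the digest of the first layer
DIGEST=$(jq -r '.layers[0].digest' manifest.json)

# 4+5. download that blob (-L follows the redirect to the CDN)
curl -sL -H "Authorization: Bearer $TOKEN" \
     -o layer.tar.gz \
     "https://registry-1.docker.io/v2/library/alpine/blobs/$DIGEST"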

GitHub Packages

To see the full example of downloading files from GitHub Packages, click here.
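
That link is gone too; the flow on GitHub Packages (ghcr.io) is the same dance with different hosts. <owner>/<image> below are placeholders, and anonymous tokens should work for public packages:

# anonymous pull token for a public package
TOKEN=$(curl -s "https://ghcr.io/token?scope=repository:<owner>/<image>:pull" | jq -r '.token')

# manifest, then blobs, exactly as with Docker Hub
curl -s -H "Authorization: Bearer $TOKEN" \
     -H "Accept: application/vnd.oci.image.manifest.v1+json" \
     "https://ghcr.io/v2/<owner>/<image>/manifests/latest"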

Why?

I wrote this article because many, many folks on the Internet have asked how to manually download files from OCI registries, but their simple questions are usually met with a barrage of useless counter-questions: why the heck would you want to do that!?!

The answers vary.

Some people need to get files onto a restricted environment. Either their org doesn't grant them permission to install software on the machine, or the system has firewall-restricted internet access -- or doesn't have internet access at all.

3TOFU

Personally, the reason that I wanted to be able to download files from an OCI registry was for 3TOFU.

Verifying Unsigned Releases with 3TOFU

Unfortunately, most apps using OCI registries are extremely insecure. Docker, for example, will happily download malicious images. By default, it doesn't do any authenticity verification on the payloads it downloads. Even if you manually enable DCT (Docker Content Trust), there are loads of open issues with it.

Likewise, the macOS package manager brew has this same problem: it will happily download and install malicious code, because it doesn't use cryptography to verify the authenticity of anything that it downloads. This introduces watering hole vulnerabilities when developers use brew to install dependencies in their CI pipelines.

My solution to this? 3TOFU. And that requires me to be able to download the file (for verification) on three distinct Linux VMs using curl or wget.

⚠ NOTE: 3TOFU is an approach to harm reduction.

It is not wise to download and run binaries or code whose authenticity you cannot verify using a cryptographic signature from a key stored offline. However, sometimes we cannot avoid it. If you're going to proceed with running untrusted code, then following a 3TOFU procedure may reduce your risk, but it's better to avoid running unauthenticated code if at all possible.
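
The verification step itself is simple once curl can fetch the artifact. A sketch (URL hypothetical), run independently on each of the three VMs, ideally over different network paths:

# fetch the artifact and print its digest
curl -sLO https://example.com/releases/app.tar.gz
sha256sum app.tar.gz

# only proceed if all three VMs report the identical digest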

Registry (ab)use

Container registries were created in 2013 to provide a clever & complex solution to a problem: how to package and serve multiple versions of software to various consumers spanning multiple operating systems and architectures -- while also packaging them into small, discrete "layers".

However, if your project is just serving simple files, then the only thing gained by uploading them to a complex system like a container registry is headaches. Why do developers do this?

In the case of brew, their free hosting provider (JFrog's Bintray) shut down in 2021. Brew was already hosting their code on GitHub, so I guess someone looked at "GitHub Packages" and figured it was a good (read: free) replacement.

Many developers using Container Registries don't need the complexity, but -- well -- they're just using it as a free place for their FOSS project to store some files, man.

4
 
 

TLDR: An AMI test key was used in production by a bunch of manufacturers. The key has now been leaked.

5
 
 

cross-posted from: https://sh.itjust.works/post/22460079

Today I'm grateful I'm using Linux - global IT issues as CrowdStrike update causes BSODs on Windows

This isn't a gloat post. In fact, I was completely oblivious to this massive outage until I tried to check my bank balance and it wouldn't log in.

Apparently Visa Paywave, banks, some TV networks, EFTPOS, etc. have gone down. Flights have had to be cancelled as some airlines' systems have also gone down. Gas stations and public transport systems are inoperable, and numerous Windows systems and Microsoft services are affected. (At least according to one of my local MSM outlets.)

Seems insane to me that one company's messed-up update could cause so much global disruption and take so many systems down :/ This is exactly why centralisation of services, and large corporations gobbling up smaller companies to become behemoth services, is so dangerous.

6
 
 

cross-posted from: https://lemmy.ml/post/18154572

All our servers and company laptops went down at pretty much the same time. Laptops have been bootlooping to blue screen of death. It's all very exciting, personally, as someone not responsible for fixing it.

Apparently caused by a bad CrowdStrike update.

7
 
 

Hello! I am looking for suggestions for Slack alternatives that meet the following (likely impossible) criteria:

  • Modern UI
  • Self-hosted FOSS
  • Actively developed, or at least stable and maintained
  • Comprehensive API for integrations
  • Non-shit strategy for determining which device to send notifications to

Regarding UI, I am hoping to find something with a more streamlined implementation of threaded conversations - this is my primary complaint with Slack.

I know there are tons of articles on Slack alternatives, but I'm hoping for a more technical perspective. Are there any Matrix-based options that are refined enough for a small team to rely on as a primary method of communication?

Thank you!

8
 
 

I am working part time for a small company. They have about 40 employees who use email every day for work, and recently they acquired MS 365 accounts for 10 employees who use them mainly for Teams meetings with customers, but also SharePoint, etc.

Buying an MS account for each of the 40 would be too expensive and unnecessary, because the other 30 only really use email in their day-to-day work.

So what I did initially was to follow this Microsoft doc: https://learn.microsoft.com/en-us/exchange/mail-flow-best-practices/how-to-set-up-a-multifunction-device-or-application-to-send-email-using-microsoft-365-or-office-365

So our MX record points to the Exchange server, and Exchange relays mail to the secondary email server where those 30 accounts exist.

It was working fine until we started to get "Not delivered" messages back with this error:

Error:	550 5.7.367 Remote server returned not permitted to relay -> 554 5.7.1 : Relay access denied

I talked to the support of this secondary email server and they told me they do not support this operation.

So I am looking for help in finding some server that would allow me to work like this. Do you happen to know some company you could recommend?

9
 
 

Hi all, I want to set up a fileserver as a KVM guest which will access a 2TB disk partition to store its data. In order to do this I see 5 options:

  1. Attach the whole disk to the VM and access the partition as you do in the host machine. -> contraindicated by the RHEL documentation for security reasons.

  2. Attach only the partition to the VM. Inside the VM, the partition appears as a drive which needs a new partition table. This seems good to me (for reasons I'll explain later), but I don't know how the partition-table-inside-a-partition thing works and what implications it comes with.

  3. Create a sparse max-2TB qcow2 image, store it in the physical partition and attach it to the VM. -> rejected by me because the partition inside the qcow2 image needs constant resizing as your storage needs grow.

  4. Create a fully initialized 2TB qcow2 image. -> current way of doing it, no resizes, no security concerns (I guess). The only drawback I perceive is the time required to initialize a 2TB image (~2.5hours in an HDD).

  5. Share the physical partition from the host over NFS. I haven't really investigated this solution -- nor am I experienced with NFS -- but it seems like it will require some configuration on the host too, which is something I want to avoid because I don't want to redeploy the host in case shit hits the fan.

So, why does option 2 seem good to me? Neither the resizes of 3 nor the long setup times (image initialization) of 4.
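
For concreteness, options 2 and 4 boil down to something like this (device names and paths are hypothetical):

# option 2: hand the partition itself to the guest as a disk
virsh attach-disk fileserver /dev/sdb3 vdb --persistent

# option 4: fully-allocated qcow2 image (no growth/resizing later);
# preallocation=falloc is near-instant on filesystems that support it,
# vs. hours for preallocation=full on an HDD
qemu-img create -f qcow2 -o preallocation=full /var/lib/libvirt/images/data.qcow2 2T
virsh attach-disk fileserver /var/lib/libvirt/images/data.qcow2 vdb --subdriver qcow2 --persistent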

Is there any other solution that I have missed? If not, out of these, which should I choose?

Sorry for the long post, I tried to be as detailed as possible.

10
 
 

Not sure if this is the right place.

The last few days I've been experiencing a few issues resolving DNS on my home network. Strangely, rebooting the router seemed to fix it for a time. After running into the issue again I decided to investigate further. I'm using a Mikrotik router with my PC wired in with ethernet cable. The router is using DoH to Quad9 (https://dns.quad9.net/dns-query as per their documentation). I've also imported root certificates for validation.

As of right now, my desktop cannot resolve dns against 9.9.9.9, however it can resolve dns against 1.1.1.1 and 8.8.8.8.

$ dig @9.9.9.9 reddit.com

;; communications error to 9.9.9.9#53: timed out

Interestingly, I also cannot curl the DoH URL (also a timeout). I thought maybe Quad9 was having issues, so I jumped over to my EC2 instance, and there I can dig/curl just fine.
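
To separate a port-53 problem from a DoH problem, both paths can be tested directly (dig supports +https from BIND 9.18 onward):

# DoH query to Quad9; a timeout here implicates the HTTPS path as well
dig +https @dns.quad9.net reddit.com

# TLS-level reachability of the DoH endpoint:
# a fast HTTP error is fine, a connect timeout is not
curl -v --connect-timeout 5 "https://dns.quad9.net/dns-query"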

I also turned on debug logging on the router; the logs indicate the same issue my desktop is having (timeout errors, sometimes an SSL handshake error).

My question to you all is, have I missed something in my testing/setup, or is Comcast blocking Quad9?

Additional info:

The Mikrotik is on the latest firmware (6.49.10). I can switch to Cloudflare DoH on the router and it works fine. I can remove the DoH setting entirely and it works. I've got 8.8.8.8 as a static DNS server and the 2 Comcast DNS servers are dynamic (75.75.75.75 and 75.75.76.76). NTP is set up and the router has the correct date/time/timezone.

As of this writing rebooting the router is no longer temporarily fixing the problem.

Edit:

Thanks u/[email protected] !

Per their post the status page shows issues in my area: https://uptime.quad9.net/

11
 
 

Not sure if this is the right place to ask, but recommendations for personal and family password management?

I finally switched to Firefox on my phone, because Chrome "privacy". And then, when trying to find out how to enable password storage, I accidentally set up Microsoft Authenticator as the phone-wide password manager. Realizing this meant cross-app password management, I finally accepted that my old approach of politely ignoring the problem and manually memorizing algorithmic passwords is no longer tenable. I honestly would prefer the anti-privacy approach where every service just uses OAuth and only one provider has my password, but we're not there today, so time to learn the new tech.

So basically, what's the current OSS best-practice for a one-stop-shop password management software? I know "OSS" and "big safe cloud storage provider" are kind of oxymoronic, but imho encrypted-cloud-storage is the best tradeoff between security and convenience.

And, ideally, something I could get my kids onto too, to manage some shared family PWs, since I assume their password management strategies are either "reset every time" or "just use the same PW everywhere and it's a ticking time-bomb".

12
 
 

Hey all,

I want to start using btrfs on my SAN/NAS and use that as a backend for my Nextcloud. Before I read up on btrfs I was thinking about using RAID1. I thought RAID1 would fulfill my two requirements:

  • It would allow me to just pull out a disk and put in a usb dock and read its contents. (disaster recovery, or for my SO to just power down the server and get her data off if something happens to me).
  • It would simply mirror the data so a single drive can fail and everything is fine.

Now I read in the btrfs documentation and in some other places that the RAID1 implementation of btrfs is non-standard, in that it also has some striping functionality.

The image included is from the btrfs docs and it seems it also stripes, not just mirrors, when using 4 disks.

Now my question is: what is its behaviour when using 2 disks? Will this fulfill my two requirements? If not, do you have any other recommendations? (I mean, I could use zfs...)

A penny for your thoughts :-).
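
FWIW, with exactly two devices the raid1 profile keeps one copy of every chunk on each disk -- the striping-like distribution only shows up with more than two devices. A quick sketch of creating and inspecting such a setup (device names hypothetical):

# make a two-disk btrfs raid1 (both data and metadata mirrored)
mkfs.btrfs -m raid1 -d raid1 /dev/sdX /dev/sdY

# after mounting, confirm the RAID1 profile and chunk allocation
btrfs filesystem usage /mnt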

13
 
 

Hey all,

Edit: changed confusing wording based on dack's comment.

I have a problem. I'm building a SAN and I'm playing around with btrfs to learn more about how to use it.

I've run into a problem where my sdd1 partition is recognized as a swap filesystem. I don't understand what's going on here. I formatted all these drives through my USB dock via my desktop. All the others are fine, so why is this one giving me problems? Removing the partition with parted and recreating it as btrfs or ext4 doesn't seem to help.

Does anyone have any insight of why this is happening?

root@server :~# lsblk --fs
....
sdc                                                                                                           
└─sdc1                       ext4        1.0            e3e8849d-a25e-4235-8ebf-ca84a7637f64                  
sdd                                                                                                           
└─sdd1                       swap        1              445ae89e-05ef-4fd0-98e3-b592fb2a8a9c                  
sde                                                                                                           
└─sde1                       btrfs                      bc864736-2bf6-4379-aa57-46f1c0f3a95d 
14
 
 

Hey all.

I need some advice on how to deal with ad-hoc vs planned work. There are emails, tickets and verbal interruptions that need my attention. Additionally, there is an increasing number of meetings I need to attend. At the same time I want to focus on the development of the infrastructure for the planned work. I notice that all the interruptions are detrimental to both the planned and the ad-hoc work.

The fact that I have to switch my attention all the time and can't just focus starts to frustrate me. It also has to do with my ADHD: I can't utilize my hyperfocus to finish the planned work; instead the interruptions stimulate the attention-switching side of my ADHD and I can't get into the problem. I just notice I am not as effective as I was before I got this workload.

Do you people recognize this struggle? How do you deal with this?

15
 
 

I’ve started at a medium-sized org (~1500 users) that has over a dozen global admins in 365, plus another 80 users with various 365 admin access. Does anyone have any tips for how to identify what access the users actually need?

I tried punching up a questionnaire with all of the available options, but my test group reported that it was too convoluted. I’m not sure how I can better identify their needs without interviewing them one-on-one, or just ripping away access and seeing who screams.
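
As a first step, it may help to inventory who actually holds which role today. A sketch against Microsoft Graph (the endpoints are real; token acquisition is omitted and <role-id> is a placeholder):

# all activated directory roles in the tenant
curl -s -H "Authorization: Bearer $TOKEN" \
     "https://graph.microsoft.com/v1.0/directoryRoles" | jq -r '.value[] | .id + "  " + .displayName'

# members of a given role
curl -s -H "Authorization: Bearer $TOKEN" \
     "https://graph.microsoft.com/v1.0/directoryRoles/<role-id>/members" | jq -r '.value[].userPrincipalName'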

16
17
18
 
 

Hey all,

I would like to get the above certifications. What resources did you use to study? I can't afford the official training and my employer doesn't want to pay for it.

Any and all help, and all tales of your experience, are appreciated.

19
 
 

Hopefully not a lot of you have a business that uses BlueJeans as its core videoconferencing software. Because if you do, you'll want to plan a migration soon.

20
 
 

As a sysadmin mostly used to the nice and powerful way Postgres manages dates, every time I've had to do stuff in SQLite I find myself missing that. It feels like they offloaded date handling onto whatever code connects to the database instead of handling it at the DB level.

Is there a way to give SQLite the powerful and reliable date management Postgres has, or at least something similar? Hopefully something as devoid of dependency hell as SQLite itself is.
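
For what it's worth, SQLite does ship built-in date/time functions with no extra dependencies -- dates are just stored as TEXT, REAL, or INTEGER rather than a dedicated type. A quick demo via the sqlite3 shell:

# ISO-8601 text, date arithmetic, and unix-epoch conversion
sqlite3 :memory: "SELECT datetime('now'), date('now', '+7 days'), datetime(1721370000, 'unixepoch');"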

21
 
 

I have 15 VMs running for clients and I'm looking for a way to keep the tools up to date without having to connect to each server and do it manually. A few examples are WinDirStat, Firefox, SSMS, FileLocator, etc.

We have expanded recently and I'm at the limits of doing this manually. These servers are not domain joined and are in separate virtual networks.
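
One hedged option is winget, assuming it's available on those server builds (it isn't preinstalled on Windows Server) and the tools in question are winget-packaged:

# upgrade everything winget knows about, per VM
winget upgrade --all --silent

# or, with Chocolatey
choco upgrade all -y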

22
 
 

So, I have a VM DC that I had to restore from a month-old snapshot. I had other DCs that were physical and stayed up. My understanding is that if it's been "off" for under 60 days (the default tombstone lifetime), it's fine to basically "power back on" the snapshot. However, now the restored DC has replication disabled in both directions. Should I manually enable inbound replication first and then, after a while, enable outbound replication?

Or a better fix method?
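
For reference, the disable flags can be inspected and cleared with repadmin (DC name hypothetical); the conventional order is inbound first, verify, then outbound:

# show current options; look for DISABLE_INBOUND_REPL / DISABLE_OUTBOUND_REPL
repadmin /options RESTORED-DC

# clear the inbound block, check replication health, then clear outbound
repadmin /options RESTORED-DC -DISABLE_INBOUND_REPL
repadmin /showrepl RESTORED-DC
repadmin /options RESTORED-DC -DISABLE_OUTBOUND_REPL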

23
 
 

Thanks guys, gals and everybody in between.

24
 
 

Hi! I've inherited a machine installed by somebody else who's no longer in the company or the country. The machine is running just fine, but I see no Dockerfiles or docker-compose.yml, and this looks like something that came from a Compose file with a few linked containers.

Is it possible to reconstruct that info from the running containers? I'm still a raw Docker newbie at this point, so I don't know if this is even possible; it would be helpful not to have to try and contact the person who set it up.
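
It is possible to recover most of it: docker inspect exposes each container's env vars, mounts, ports, and networks, and third-party helpers such as runlike can reassemble an approximate docker run command. A sketch (container name hypothetical):

# the raw JSON: env, bind mounts, port mappings
docker inspect some_container | jq '.[0].Config.Env, .[0].HostConfig.Binds, .[0].HostConfig.PortBindings'

# or let a helper reconstruct the docker run invocation
pip install runlike
runlike some_container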

25
 
 

This really doesn't make me love cloud identity management. It's exactly the (kind of nightmare) scenario where you attack the cloud infrastructure and get access to many different customers and apps... potentially in a way you can't detect at all. At least with local identity providers they have to compromise you directly, and you might have logs.

view more: next ›