212
submitted 9 months ago by narwhal@lemmy.ml to c/technology@lemmy.ml
[-] breadsmasher@lemmy.world 51 points 9 months ago

Interesting how they have kept their ops team the same but now run an entire datacentre.

Overworked teams? I just can’t see how this is possible.

Not defending cloud hosting/costs etc. You generally pay more for cloud so that you don’t have to deal with hardware maintenance and datacentre management. I didn’t see this addressed directly in their post, other than that they kept the same size ops team.

[-] aard@kyu.de 58 points 9 months ago

I'm running both physical hardware and cloud stuff for different customers. The problem with maintaining physical hardware is getting a team of people with relevant skills together, not the actual work - the effort is small enough that you can't justify hiring a dedicated network guy, for example, and same applies for other specialities, so you need people capable of debugging and maintaining a wide variety of things.

Getting those always was difficult - and (partially thanks to the cloud stuff) it has become even more difficult by now.

The actual overhead - even when you're racking the stuff yourself - is minimal. "Put the server in the rack and cable it up" is not hard - my last rack was filled by a high school student in a part of an afternoon, after explaining once how to cable and label everything. I didn't need to correct anything - which is a better result than many highly paid people I've worked with...

Paying for remote hands in the DC, or - if you're big enough - just ordering complete racks with pre-racked and pre-cabled servers, gets rid of the "put the hardware in" step.

Next step is firmware patching and bootstrapping - that happens automatically via network boot. After that it's provisioning the containers/VMs to run on there - which at this stage isn't different from how you'd provision it in the cloud.

You do have some minor overhead for hardware monitoring - but you hopefully have some monitoring solution anyway, so adding hardware, and maybe have the DC guys walk past and inform you of any red LEDs isn't much of an overhead. If hardware fails you can just fail over to a different system - the cost difference to cloud is so big that just having those spare systems is worth it.

I'm not at all surprised by those numbers - about two years ago somebody was considering moving our stuff into the cloud, and asked us to do some math. We'd have ended up paying roughly our yearly hardware budget (including the hours spent working with hardware, which we wouldn't have with a cloud) to host just one of our largest servers in the cloud - and we'd have to pay that again every year, while with our own hardware and properly planned maintenance we can let old servers we paid for years ago slowly age out naturally.
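
A minimal sketch of the kind of math described above, in Python. All figures are hypothetical placeholders rather than numbers from this thread, and the function names are made up for illustration. It assumes owned hardware costs an upfront purchase plus yearly upkeep, while cloud costs a flat yearly fee:

```python
def cumulative_costs(hardware_upfront, hardware_yearly_upkeep, cloud_yearly, years):
    """Return a list of (year, owned_total, cloud_total) tuples."""
    rows = []
    for year in range(1, years + 1):
        # Owned hardware: pay once up front, then upkeep every year.
        owned = hardware_upfront + hardware_yearly_upkeep * year
        # Cloud: pay the same fee again every year.
        cloud = cloud_yearly * year
        rows.append((year, owned, cloud))
    return rows


def break_even_year(hardware_upfront, hardware_yearly_upkeep, cloud_yearly, years=10):
    """First year in which owning is cheaper in total than renting, or None."""
    for year, owned, cloud in cumulative_costs(
        hardware_upfront, hardware_yearly_upkeep, cloud_yearly, years
    ):
        if owned < cloud:
            return year
    return None


# Hypothetical figures, not taken from the thread or the article.
print(break_even_year(hardware_upfront=100_000,
                      hardware_yearly_upkeep=20_000,
                      cloud_yearly=60_000))  # prints 3
```

The longer paid-off servers keep running past the break-even year, the wider the gap gets, which is the "age out naturally" point.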

[-] breadsmasher@lemmy.world 7 points 9 months ago

Thank you for the very detailed response!

[-] killeronthecorner@lemmy.world 17 points 9 months ago

They're using a third party called deft to manage the hardware. Which is a reasonable middleground between cloud and self-operated, the more I think about it.

I haven't seen a lot of info on what that management costs, but it's likely to be leagues less than AWS/GCP.

[-] chiisana@lemmy.chiisana.net 11 points 9 months ago

It’s not just the hardware. “The cloud is expensive” is usually touted by people who don’t understand why managed services (like Aurora RDS and OpenSearch, as suggested in the article) ‘cost more than running it themselves’, because they don’t account for the management costs.

A database service needs management not only in hardware (i.e. replacing dead drives) but also in software (i.e. monitoring cluster performance, tweaking system settings to fit the usage pattern, managing cluster health, etc.). This management requires time from the ops team, often in multiple roles like SysAdmin, DBA, and Ops engineer. The fact that they claim to have moved to their own hardware without bringing new talent onto their ops team makes it questionable whether they actually understand the cost, and whether they’re overworking their existing ops team.

[-] sugar_in_your_tea@sh.itjust.works 4 points 9 months ago

Or it could be that they haven't run into problems yet. If you overbuild your hardware or your software is efficient enough, you don't need as much tweaking.

It's questionable, but I don't think implausible.

[-] chiisana@lemmy.chiisana.net 4 points 9 months ago

“yet” is the keyword there for sure. It’s not a matter of if, but a matter of when. Even if they’re flush with cash and grossly over-provision their systems, sooner or later a huge vulnerability will roll around and someone will need to set up / update the OS, ensure quorum is available for their cluster, fail over traffic during update windows, etc.

The stacks are getting so insurmountably huge, it’s not possible to just drop a new cluster at their described scale without significantly increasing the workload for an existing team.

[-] scytale@lemm.ee 3 points 9 months ago

Yup. By moving out, they already let go of a lot of security services that came with their cloud subscription like CASB, automated patching, DB maintenance, security/network monitoring, etc. You have to replace all of that with people and on-prem tools/systems.

[-] makingStuffForFun@lemmy.ml 6 points 9 months ago

Warning. This site claims you've been blocked and asks for your email to verify you. Do not provide it. Reloaded and it worked. Just be safe out there

[-] killeronthecorner@lemmy.world 5 points 9 months ago

That isn't happening for me, nor has it ever when I've visited DHH's blog. It's possible your browser is compromised.

[-] makingStuffForFun@lemmy.ml 2 points 9 months ago

I have strong privacy settings enabled. I believe it might be because they can't fingerprint me or similar, so are checking for bot activity

[-] killeronthecorner@lemmy.world 5 points 9 months ago

That seems extremely unlikely, and almost unheard of. If I wget the page in a container, I get the same as in the browser, so that would suggest this isn't the case.

[-] breadsmasher@lemmy.world 5 points 9 months ago

This didn’t happen for me

[-] books@lemmy.world 3 points 9 months ago
[-] RickRussell_CA@lemmy.world 6 points 9 months ago

"An entire data center" is 8 rented racks in two enterprise data centers (4 racks in each). They're paying $60K/month for racks, cooling, and location.

[-] notabot@lemm.ee 31 points 9 months ago

That's the thing, 'cloud' is just another tool in your toolbox. It's the right tool for some workloads and the wrong one for others. The fact they've shifted the work to their own servers and kept the ops team suggests it was the wrong sort of workload to be in the cloud in the first place.

For a while there was an obsession with moving everything to the cloud, and that was always going to be an expensive mistake in a number of different ways. Hopefully, as the hype dies down more nuanced decisions will be made. There's a whole gamut of options between all in the cloud and all in the data centre, and when people jump straight from one end to the other I'm put in mind of Hamlet's quote "There are more things in heaven and earth, Horatio, / Than are dreamt of in your philosophy." Understand your workload, understand your business' future plans and their needs, and then make a plan, considering all the tools at your disposal.

[-] Aceticon@lemmy.world 5 points 9 months ago* (last edited 9 months ago)

If there's anything that 3 decades in Tech have taught me, it's that fad-following commonly rules it, even with the supposedly logical (but not really) techies.

Cloud storage and cloud computing became a fad about a decade ago (I still remember the hype repeated by people who had never actually designed distributed systems), so there were tons of people jumping headfirst into it without a plan, for the hype and the seemingly cheaper price (if you didn't think your needs and future evolution through), even though it wasn't the best choice for them.

No doubt we'll see the same kind of fad-following-over-making-sense-for-us thing with the latest hype-train: AI.

[-] Pieisawesome@lemmy.world 5 points 9 months ago

I hate the obsession to move to the cloud and the obsession towards serverless or functions.

Functions are stupid and crazy for anything that is actually used often.

For small utilities, they make a ton of sense, but next time I see an app with millions of requests per day using functions, I'm going to lose my mind.

[-] Aceticon@lemmy.world 2 points 9 months ago* (last edited 9 months ago)

Years ago I was the senior techie in designing and implementing distributed high performance server systems and what you reminded me of just made my blood start to boil... :/

[-] skullgiver@popplesburger.hilciferous.nl 2 points 9 months ago* (last edited 7 months ago)

[This comment has been deleted by an automated system]

[-] dan1101@lemm.ee 26 points 9 months ago* (last edited 9 months ago)

What always kept me off the "cloud" (other people's computers) is not only giving up my data but giving up control over what I spend. Corporations lure you in with flashy promises and low prices, then usually over time the service gets worse and the prices go higher and higher. I'm sure the cloud hosting corporations are good at pricing their services very high but not quite high enough to make most customers cancel.

[-] Aceticon@lemmy.world 5 points 9 months ago* (last edited 9 months ago)

Lock-in is quite an old strategy in Tech (back in the day Microsoft's dominance was built on it) and apparently every new generation needs to learn their lesson...

[-] dan1101@lemm.ee 2 points 9 months ago

That's true, back in the 1970s and 1980s IBM locked companies in with mainframes and PCs were their way out.

[-] cyclohexane@lemmy.ml 23 points 9 months ago

Exiting cloud being useful seems to be a very narrow use case.

For one, you have to be at a large enough scale where buying and hosting your own infra is feasible and cheaper.

Second, you have to give up the ability to almost instantly scale up or provision hardware in response to traffic or other events. (which is very common at scale)

Maybe his use case happens to be that very narrow case, but this isn't something I would take as general advice.

[-] skullgiver@popplesburger.hilciferous.nl 10 points 9 months ago* (last edited 7 months ago)

[This comment has been deleted by an automated system]

[-] evranch@lemmy.ca 3 points 9 months ago

Your last paragraph is why we've heavily used the cloud here in rural Canada for years.

Monitoring data is much easier to push into the cloud and read from there than it is to hope for a reliable connection to a farm or rural plant.

Self-hosted services need to be cloud hosted for uptime and because it was getting ever harder to get a routed IPv4 address from any provider. IPv6 is nice to finally have, but Starlink is the only provider at all supporting it and it's only been a few months at that. Their prefixes change constantly too, come on guys get your shit together.

Even basic remote access systems require a VPS or VPN cloud service, as you always need both ends to punch out through layers of CGNAT. Now we can finally have one end available through IPv6, but the remote user is often trying to connect from an IPv4 CGNAT network... So you still need something in the cloud to punch holes.

Can't believe it's been over 20 years for the IPv6 rollout

[-] skullgiver@popplesburger.hilciferous.nl 4 points 9 months ago* (last edited 7 months ago)

[This comment has been deleted by an automated system]

[-] flumph@programming.dev 8 points 9 months ago

DHH is a contrarian. Any benefits of the cloud he might get are overridden by the fact that he needs to be different (and blog about it).

See his stances on Typescript, workplace inclusion, TDD, etc.

[-] bahmanm@lemmy.ml 22 points 9 months ago

This is quite intriguing. But DHH has left so many details out (at least in that post) as pointed out by @breadsmasher@lemmy.world - it makes it difficult to relate to.

On the other hand, like DHH said, one's mileage may vary: it's, in many ways, a case-by-case analysis that companies should do.

I know many businesses shrink the ops team and hire less experienced ops people to save $$$, only to forward those saved $$$ to cloud providers. I can only assume DHH's team is comprised of a bunch of experienced, well-paid ops people who can pull such feats off.

Nonetheless, looking forward to, hopefully, a follow up post that lays out some more details. Pray share if you come across it 🙏

[-] slazer2au@lemmy.world 6 points 9 months ago

This is part of a series of posts he has done about finding out his cloud bill was stupid high because they run computationally heavy software, and switching over to colocation. But going from 100% cloud to colo and saving that much money is not to be scoffed at.

He does say this is an outlier and others won't get as much ROI as they have.

[-] palitu@aussie.zone 2 points 9 months ago

there are a number of blog posts that have different details about the how/why, etc. i just followed the links in the article to other parts of the series.

I expect that the use case is more prevalent than you think, wherever you are spending a decent chunk on cloud infra. I have been convinced for some time now that the costs are high compared to our on-prem. I really like the idea of the "deft" type of hardware management service, so that they look after the DCs, hardware and connectivity, and we look after the software.

[-] fubarx@lemmy.ml 9 points 9 months ago

Hopefully, they place their servers at 2x the historical peak floodpoint. Or set up standby zones in different geographies in case there's a power or network outage.

Came upon several projects where folks hadn't...

[-] sylver_dragon@lemmy.world 7 points 9 months ago

Having your compute in "the cloud" doesn't remove the need for a good backup strategy, it just changes how it works. Yes, disaster recovery for natural disasters should be easier (though OVH's fire showed that this may not always be true). But that doesn't cover cases like ransomware, insider threats, data mistakes or any other case where data is corrupted/modified by mistake. You still need a plan for these cases. And cloud based backups actually make a lot of sense.

But, just because you put your backups in the cloud, doesn't mean that your compute should be there as well. There is an advantage that your Time to Recovery is likely lower with both backups and compute in the same cloud. But, is that worth the ongoing cost of running your compute in the cloud? That needs to be considered separately. You also need to consider the cost of running on-prem versus in the cloud. If you have fairly predictable, static loads, it may be cheaper to buy and run servers yourself. For hard to predict, elastic loads, cloud may make more financial sense.

As others have said before, there was a period where companies were just going to the cloud for the sole reason that it was the popular thing to do. For some it actually made financial sense. For some, it didn't. The OP's article seems to be the latter.
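
The static-versus-elastic trade-off above can be sketched roughly in Python. This is a toy model under stated assumptions, not a real pricing calculation: owned hardware is sized for the peak and paid for every hour, the cloud is idealised as pay-per-use, and the loads and rates are hypothetical.

```python
def owned_cost(hourly_load, rate_owned):
    """Owned hardware must be sized for the peak and is paid for every hour."""
    peak = max(hourly_load)
    return peak * rate_owned * len(hourly_load)


def cloud_cost(hourly_load, rate_cloud):
    """Idealised cloud: pay only for the servers actually in use each hour."""
    return sum(hourly_load) * rate_cloud


# Hypothetical loads (servers needed per hour over one day).
static_load = [10] * 24            # flat, predictable demand
spiky_load = [2] * 22 + [40] * 2   # mostly idle with a short burst

# Hypothetical cost units per server-hour; cloud assumed 3x more expensive.
OWNED_RATE = 1.0
CLOUD_RATE = 3.0

for name, load in [("static", static_load), ("spiky", spiky_load)]:
    owned = owned_cost(load, OWNED_RATE)
    cloud = cloud_cost(load, CLOUD_RATE)
    cheaper = "owned" if owned < cloud else "cloud"
    print(f"{name}: owned={owned} cloud={cloud} -> {cheaper} is cheaper")
```

With these made-up numbers the flat load favours owned servers and the spiky load favours the cloud, even at a 3x per-hour premium.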

[-] KevonLooney@lemm.ee 2 points 9 months ago

Exactly. Use cloud for off-site backup and things that need flexibility.

You don't need any of that to run a basic website. You can almost use an old laptop or PC for most static applications.

[-] repungnant_canary@lemmy.world 2 points 9 months ago

So how then did people using this *miraculous and incredibly safe* (/s) cloud lose their data in the OVH datacenter fire?

[-] progandy@feddit.de 3 points 9 months ago

They used the cheap option without geographic mirrors.

[-] ours@lemmy.film 1 points 9 months ago

That was a data center, not a cloud. The sort of place they are moving to from the cloud.

With a cloud solution, you make sure to use services that are redundant. AWS and Azure build each region (geographical location) with **multiple** interconnected independent data centers (availability zones). High durability is one of the strong use cases for public clouds.

[-] KingThrillgore@lemmy.ml 9 points 9 months ago
[-] Ubermeisters@lemmy.zip 8 points 9 months ago

Yeah ok well when you get ransomware'd you're going to wish you had Cloud backups.

Ask me how I know

[-] t7tis@lemmy.ml 34 points 9 months ago

There are also many organizations that wish they had some local backups after their cloud service providers lost all their data. Lesson to learn: Backup properly with offline storage. Tape in a safe, maybe even off-site, etc.

[-] sylver_dragon@lemmy.world 9 points 9 months ago

So, what you're saying is that, regardless of where you run your workloads, you should still follow the 3-2-1 rule?

3 - copies of the data. 2 - different media. 1 - offsite.

It's funny how cloud doesn't really change the basics of good systems administration.
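
The 3-2-1 rule is simple enough to check mechanically. A minimal Python sketch, where the `BackupCopy` type and `satisfies_3_2_1` function are hypothetical names for illustration, and the primary copy is assumed to count as one of the three:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    media: str     # e.g. "disk", "tape", "object-storage"
    offsite: bool  # stored away from the primary location?


def satisfies_3_2_1(copies):
    """True if there are >= 3 copies, on >= 2 media types, with >= 1 offsite."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.media for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite


# Hypothetical setup: primary on disk, local tape, and a cloud copy offsite.
plan = [
    BackupCopy(media="disk", offsite=False),
    BackupCopy(media="tape", offsite=False),
    BackupCopy(media="object-storage", offsite=True),
]
print(satisfies_3_2_1(plan))  # prints True
```

Note that nothing in the check cares whether the workload itself runs in the cloud or on-prem.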

[-] Ubermeisters@lemmy.zip 6 points 9 months ago

Almost like a responsible modern day approach is multifaceted

[-] PortugalSpaceMoon@infosec.pub 4 points 9 months ago
[-] Ubermeisters@lemmy.zip 2 points 9 months ago

An earwig told me

[-] SirEDCaLot@lemmy.fmhy.net 8 points 9 months ago

I've been saying this for a long time.

There are use cases for the cloud. I put e-mail in the cloud- ain't nobody got time to deal with providing reliable SMTP or Exchange while keeping spam out. If you have a web app that needs to scale quickly, cloud's the way. If you're a startup with limited capital and you don't want to blow it on a bunch of servers when you're not sure if you'll survive more than a year or so, cloud's the way.

But Cloud ISN'T the end-all answer for everything.

If you have a predictable workload, especially one that relies on more expensive cloud services, de-clouding can save you a bundle. Buying hardware can be cheaper than renting it, if only because (think about it) the cloud provider has to buy the same hardware and rent it to you AND make a profit. If you're going to be around a while, and you expect to use a piece of hardware for its full service life, that makes a lot of sense.

[-] PowerCrazy@lemmy.ml 2 points 9 months ago

As long as you realize that the "cloud" is someone else's computer, it is a very viable way of hosting your service. However, as your service grows, all those micro services that your cloud provider charges you for will grow as well. Eventually you'll get to the point where "data transfer" costs begin to make up >50% of your total cloud spend. At that point (or ideally before) you should have a plan to stop expanding your cloud footprint, because that cost grows geometrically with the size of your cloud data and the number of cloud functions you are using on your data.

Remember Data has Weight. If you don't understand what that means, you aren't ready to make a cost comparison between cloud-hosting and data center hosting.
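
One way to keep an eye on the >50% threshold mentioned above is to track data transfer as a share of the bill. A small Python sketch, assuming a hypothetical bill broken down by category (the category names and amounts are made up, not any provider's real billing format):

```python
def transfer_share(bill):
    """Fraction of the total bill attributed to data transfer."""
    total = sum(bill.values())
    if total == 0:
        return 0.0
    return bill.get("data_transfer", 0) / total


# Hypothetical monthly bill in arbitrary currency units.
monthly_bill = {"compute": 4_000, "storage": 1_500, "data_transfer": 6_500}

share = transfer_share(monthly_bill)
if share > 0.5:
    print(f"data transfer is {share:.0%} of spend: time to revisit the plan")
```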

this post was submitted on 18 Sep 2023
212 points (96.9% liked)
