I’m writing a program that wraps around dd to try to warn you if you are doing anything stupid. I have thus been giving the man page a good read. While doing this, I noticed that dd supports sizes all the way up to quettabytes, a unit orders of magnitude larger than all the data on the entire internet.

This has caused me to wonder: what’s the largest storage operation you guys have done? I’ve taken a couple of images of hard drives that were a single terabyte large, but I was wondering if the sysadmins among you have had to do something with, e.g., a giant RAID 10 array.
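As a sketch of what such a wrapper has to parse: dd-style size suffixes, here in Python. The suffix table follows GNU dd conventions (uppercase bare letters are powers of 1024, the `B`-suffixed forms powers of 1000); the `looks_stupid` helper and its 100 TiB threshold are illustrative assumptions, not anything from the actual program.

```python
# Sketch of a dd-style size-suffix parser with a sanity warning.
# Suffix values follow GNU dd conventions (K=1024, kB=1000, up to Q=2**100).

SUFFIXES = {
    "c": 1, "w": 2, "b": 512,
    "kB": 1000, "K": 1024,
    "MB": 1000**2, "M": 1024**2,
    "GB": 1000**3, "G": 1024**3,
    "TB": 1000**4, "T": 1024**4,
    "PB": 1000**5, "P": 1024**5,
    "EB": 1000**6, "E": 1024**6,
    "ZB": 1000**7, "Z": 1024**7,
    "YB": 1000**8, "Y": 1024**8,
    "RB": 1000**9, "R": 1024**9,
    "QB": 1000**10, "Q": 1024**10,
}

def parse_size(spec: str) -> int:
    """Parse a dd-style size like '4M' or '10GB' into bytes."""
    # Try longer suffixes first so 'GB' wins over a bare-letter match.
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if spec.endswith(suffix):
            return int(spec[: -len(suffix)]) * SUFFIXES[suffix]
    return int(spec)  # no suffix: plain byte count

def looks_stupid(spec: str, threshold: int = 100 * 1024**4) -> bool:
    """Hypothetical check: warn past 100 TiB, far beyond typical drives."""
    return parse_size(spec) > threshold
```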

  • Davel23 · 82 points · 2 months ago

    Not that big by today’s standards, but I once downloaded the Windows 98 beta CD from a friend over dialup, 33.6k at best. Took about a week as I recall.

    • pete_the_cat · 31 points · 2 months ago

I remember downloading the scene in American Pie where Shannon Elizabeth strips naked over our 33.6 link. It took like an hour, at an amazing resolution of like 240p, for a two-minute clip 😂

    • absGeekNZ · 17 points · 2 months ago

Yep, downloaded XP over a 33.6k modem, but I’m in NZ so 33.6 was more advertising than reality. It took weeks.

    • 50MYT · 1 point · 2 months ago

In similar fashion, I downloaded Dude, Where’s My Car? over dialup, using what was at the time the latest tech: a download manager that would split the file into 2 MB chunks and download them in order.

      It took like 4 days.

  • Urist · 63 points (1 down) · 2 months ago

    I obviously downloaded a car after seeing that obnoxious anti-piracy ad.

  • freijon · 60 points · 2 months ago

I’m currently backing up my /dev folder to my unlimited cloud storage. The backup of the file /dev/random has been running for two weeks.

    • Eager Eagle · 13 points · 2 months ago

      That’s silly. You should compress it before uploading.
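For anyone tempted to try: random bytes are incompressible, which is the other half of the joke. A quick Python check, using 1 MiB of os.urandom as a stand-in for the stream:

```python
# Random bytes carry maximum entropy, so a general-purpose compressor
# cannot shrink them; zlib's framing overhead even makes them slightly larger.
import os
import zlib

data = os.urandom(1 << 20)                 # 1 MiB of random bytes
compressed = zlib.compress(data, level=9)  # best effort
ratio = len(compressed) / len(data)
# ratio comes out at or just above 1.0: no savings at all
```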

    • Mike1576218 · 9 points · 2 months ago

No wonder. That file is super slow to transfer for some reason. But wait till you get to /dev/urandom. That file has TBs to transfer at whatever pipe you can throw at it.

    • Norah - She/They · 6 points · 2 months ago

Cool, so I learned something new today. Don’t run cat /dev/random.

      • mvirts · 1 point · 2 months ago

        Why not try /dev/urandom?

        😹

        • Norah - She/They · 2 points · 2 months ago

          Ya know, if not for the other person’s comment, I might have been gullible enough to try this

      • PlexSheep · 3 points · 2 months ago

/dev/random and other “files” in /dev are not really files; they are interfaces which can be used to interact with virtual or hardware devices. /dev/random spits out cryptographically secure random data. Another example is /dev/zero, which spits out only zero bytes.

        Both are infinite.

Not all “files” in /dev are infinite; for example, hard drives (depending on which technology they use) can be accessed under /dev/sda, /dev/sdb and so on.
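A quick Python illustration of the point above, assuming a Linux box with the usual device nodes:

```python
# Device files answer reads like ordinary files, but /dev/urandom and
# /dev/zero never hit EOF -- every read returns fresh data immediately.
with open("/dev/urandom", "rb") as f:
    chunk = f.read(16)   # 16 fresh random bytes, no waiting
with open("/dev/zero", "rb") as f:
    zeros = f.read(16)   # sixteen 0x00 bytes
```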

        • data1701d (He/Him, OP) · 1 point · 2 months ago

I’m aware of that. I was quite sure the author was joking, with the slightest bit of concern that they might actually make the mistake.

  • Neuromancer49 · 48 points (1 down) · 2 months ago

    In grad school I worked with MRI data (hence the username). I had to upload ~500GB to our supercomputing cluster. Somewhere around 100,000 MRI images, and wrote 20 or so different machine learning algorithms to process them. All said and done, I ended up with about 2.5TB on the supercomputer. About 500MB ended up being useful and made it into my thesis.

    Don’t stay in school, kids.

  • fuckwit_mcbumcrumble · 42 points · 2 months ago

    Entire drive/array backups will probably be by far the largest file transfer anyone ever does. The biggest I’ve done was a measly 20TB over the internet which took forever.

    Outside of that the largest “file” I’ve copied was just over 1TB which was a SQL file backup for our main databases at work.

    • cbarrick · 9 points · 2 months ago

      +1

      From an order of magnitude perspective, the max is terabytes. No “normal” users are dealing with petabytes. And if you are dealing with petabytes, you’re not using some random poster’s program from reddit.

For a concrete cap, I’d say 256 tebibytes.

  • Taleya · 36 points (1 down) · 2 months ago

I work in cinema content, so: hysterical laughter.

    • potajito · 14 points · 2 months ago

Interesting! Could you give some numbers? And what do you use to move the files? If you can disclose, obvs.

      • Taleya · 24 points · 2 months ago · edited

A small DCP is around 500 GB. But that’s like basic film shizz: 2D, 5.1 audio. For comparison, the 3D Deadpool 2 teaser was 10 GB.

Aspera’s commonly used for transmission due to the way it multiplexes. It’s the same protocol behind Netflix and other streamers, although we don’t have to worry about preloading chunks.

My laughter is mostly because we’re transmitting to a couple thousand clients at once, so even with a small DCP that’s around a PB dropped without blinking.
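A back-of-envelope check of that figure (the 2,000-client count is an assumed stand-in for “a couple thousand”):

```python
# A 500 GB DCP pushed to a couple thousand clients is a petabyte total.
dcp_bytes = 500 * 10**9   # small DCP, per the comment above
clients = 2000            # "a couple thousand" -- assumed figure
total = dcp_bytes * clients
petabytes = total / 10**15
```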

          • Dlayknee · 11 points · 2 months ago

            Digital Cinema Package; basically the movie file you’re watching when you’re in a movie theater.

          • Taleya · 4 points · 2 months ago

Digital Cinema Package. Films come out in a buncha files that rather resemble a DVD rip. You got your video files (still called reels!), your audio files, maybe some subtitle files and other bits and pieces, and your assetmap (a list of the files), all in a big fat folder collectively called a DCP.

        • MoonMelon · 6 points · 2 months ago

          In the early 2000s I worked on an animated film. The studio was in the southern part of Orange County CA, and the final color grading / print (still not totally digital then) was done in LA. It was faster to courier a box of hard drives than to transfer electronically. We had to do it a bunch of times because of various notes/changes/fuck ups. Then the results got courier’d back because the director couldn’t be bothered to travel for the fucking million dollars he was making.

          • CrabAndBroom · 4 points · 2 months ago

            Oh yeah I worked in animation for a bit too. Those 4K master files are no joke lol

          • WldFyre · 4 points · 2 months ago

            You legally have to tell us if that movie was Shrek.

            • MoonMelon · 3 points · 2 months ago

              Hah, nope. Shrek was made in Glendale, so they probably had everything on site or right next door.

          • Taleya · 3 points · 2 months ago

Fucking hell, the raws woulda been gigantic.

        • daq · 3 points · 2 months ago

          I used to work in the same industry. We transferred several PBs from West US to Australia using Aspera via thick AWS pipes. Awesome software.

          • Taleya · 1 point · 2 months ago · edited

            Hahahah did you enjoy Australian Internet? It’s wonderfully archaic

            (MPS, Delux, Gofilex or Qubewire?)

        • potajito · 3 points · 2 months ago

Ahhh, thanks for the reply! Makes sense! We also use Aspera here at work (videogames) but don’t move that amount, not even close.

  • ramble81 · 32 points · 2 months ago

I’ve done a 1PB sync between a pair of 8-node SAN clusters as one was being physically moved, since it’d be faster to seed the data and start a delta sync than to try to do it all over a 10Gb pipe.
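For a sense of why seeding won: even at full line rate, 1 PB over a 10 Gb/s pipe takes on the order of ten days. A rough Python estimate:

```python
# 1 PB over a 10 Gb/s link at 100% utilisation -- real-world throughput
# would be lower, so the actual wire transfer would take even longer.
bits_total = 10**15 * 8         # 1 PB in bits
link_bps = 10 * 10**9           # 10 Gb/s
days = bits_total / link_bps / 86400
# roughly 9.3 days, before any protocol or storage overhead
```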

    • jet · 9 points · 2 months ago

      I’m in the same boat, just under 3PiB

  • Hugin · 25 points · 2 months ago

It was something around 40 TB ×2. We were doing a terrain analysis of the entire Earth. Every morning for 25 days I would install two fresh drives in the cluster doing the data crunching and migrate the filled drives to our file server rack.

    The drives were about 80% full and our primary server was mirrored to two other 50 drive servers. At the end of the month the two servers were then shipped to customer locations.

  • pixeltree · 21 points · 2 months ago

I once deleted an 800 GB log file, does that count?

    • Loulou · 8 points (1 down) · 2 months ago

      Depends, did you send it to the trash can first?

  • Trigger2_2000 · 18 points · 2 months ago

    I once abused an SMTP relay (my own) by emailing Novell a 400+ MB memory dump. Their FTP site kept timing out.

    After all that, and them swearing they had to have it, the OS team said “Nope, we’re not going to look at it”. Guess how I feel about Novell after that?

    This was in the mid-90’s.

  • d00phy · 18 points · 2 months ago

    I’ve migrated petabytes from one GPFS file system to another. More than once, in fact. I’ve also migrated about 600TB of data from D3 tape format to 9940.

  • brygphilomena · 18 points · 2 months ago

I’m in the middle of moving 200 TB for my Plex server, going from a 12-bay system to a 36-bay LFF system. But I’ve also literally driven servers across the desert because it was faster than trying to move the data from one datacenter to another.

      • Norah - She/They · 2 points · 2 months ago

Just thinking about how much data you could transfer using this. MicroSD cards make it a decent amount. Latency would be horrible, but throughput could be pretty good, I think.
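Rough numbers for that thought experiment (card count, capacity, and transit time are all assumed figures):

```python
# Sneakernet throughput: an envelope of microSD cards in the post.
cards = 100                      # cards per envelope -- assumption
card_bytes = 10**12              # 1 TB per card -- assumption
transit_s = 2 * 86400            # two days in transit -- assumption
throughput_gbps = cards * card_bytes * 8 / transit_s / 10**9
# ~4.6 Gb/s sustained, with latency measured in days
```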

        • ayyy · 2 points · 2 months ago

          Amazon Snowball will send you a semi truck.

  • neidu2 · 15 points · 2 months ago · edited

I don’t remember how many files, but typically these geophysical recordings clock in at 10-30 GB. What I do remember, though, was the total transfer size: 4TB. It was a bunch of .segd files, stored in a server cluster that was mounted in a shipping container for easy transport and lifting onboard survey ships. Some geophysics processors needed it on the other side of the world. There was nobody physically heading in the same direction as the transfer, so we figured it would just be easier to rsync it over 4G. It took a little over a week to transfer.

Normally when we have transfers of a substantial size going far, we ship it on LTO. For short-distance transfers we usually run a fiber, and I have no idea how big the largest transfer job has been that way. Must be in the hundreds of TB. The entire cluster is 1.2PB, but I can’t recall ever having to transfer everything in one go, as the receiving end usually has a lot less space.

  • Decency8401 · 12 points · 2 months ago

A few years back I worked at a home. They reorganised the whole data structure but needed to move to another provider. My colleagues and I moved roughly 15.4 TB. I don’t know how long it took because, honestly, we didn’t have much to do while the data was moving, so we just used the downtime for some nerd time: gaming and a mini LAN party with our Raspberry Pis and Banana Pis.

Surprisingly, the data contained information on lots of long-dead people, which is quite scary because it wasn’t being deleted.

    • 🐍🩶🐢 · 4 points · 2 months ago

      No idea about which specific type of business it is, but keeping that history long term can have some benefits, especially to outside people. Some government agencies require companies to keep records for a certain number of years. It could also help out in legal investigations many years in the future and show any auditors you keep good records. From a historical perspective, it can be matched to census, birth, and death certificates. A lot of generational history gets lost.

      Companies also just hoard data. Never know what will be useful later. shrug