Bash one liner to randomize lines in file
The Modern Way
As many commenters pointed out over the years, most Linux systems now ship with shuf (part of GNU coreutils):
shuf unusual.txt > randorder.txt
If your system has GNU sort, sort -R also works, though it groups duplicate lines together:
sort -R unusual.txt > randorder.txt
For maximum portability (especially on older systems or macOS without GNU coreutils), the awk approach works everywhere:
awk 'BEGIN{srand()}{print rand(),$0}' unusual.txt | sort -n | cut -d ' ' -f2- > randorder.txt
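Whichever variant you pick, it's easy to confirm the output is a true permutation (same lines, new order) by comparing sorted copies. A quick sketch, using the awk pipeline above and the post's example filenames:

```shell
# Hypothetical sample data standing in for your real file
printf 'alpha\nbravo\ncharlie\ndelta\n' > unusual.txt

# Decorate with a random key, sort numerically, strip the key
awk 'BEGIN{srand()}{print rand(),$0}' unusual.txt | sort -n | cut -d ' ' -f2- > randorder.txt

# Sorted copies match only if no line was lost or mangled
diff <(sort unusual.txt) <(sort randorder.txt) && echo "permutation OK"
```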
Original 2007 Post
Discovered that the bash shell has a variable called $RANDOM, which outputs a
pseudo-random number every time you call it. Sweet! Allowed me to randomize
the lines in a file for a process I needed to do, thusly:
for i in `cat unusual.txt`; do echo "$RANDOM $i"; done | sort | sed -r 's/^[0-9]+ //' > randorder.txt
In other words, put a random number on every line, sort the file, then take off the random numbers. Worked like a charm.
Note: The for loop breaks on lines with spaces. If your file has spaces, use while read instead:
while read -r line; do echo "$RANDOM $line"; done < unusual.txt | sort | sed -r 's/^[0-9]+ //' > randorder.txt
Also, sed -r is GNU sed. On macOS, use sed -E instead.
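If you'd rather sidestep the sed dialect question entirely, the same decorate-sort-undecorate pipeline can strip the numbers with cut instead. A sketch, reusing the filenames above (sort -n keeps the comparison numeric; plain sort would compare the unpadded random numbers as text):

```shell
# Decorate each line with $RANDOM, sort numerically, drop field 1.
# Note: read trims leading/trailing whitespace unless IFS is cleared.
while read -r line; do echo "$RANDOM $line"; done < unusual.txt \
  | sort -n \
  | cut -d ' ' -f2- > randorder.txt
```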
Comments
Now that's clever. I'll have to remember that.
sed: illegal option -- r
Red: I used GNU sed on Linux, what Unix are you using?
I'm using bash on osx Leopard
my bad - i made a typo.
Red: no prob :)
It occurs to me that this thing has a problem: if the text file you were randomizing started with numbers and a blank, the regular expression could be a bit too greedy. Just a quick hack that I came up with that could be improved.
looks like it's sed -E on mac
If you really want to be sure that the contents of the file (e.g. you have a lot of lines that start with a number) don't affect the sorting, you should do something like this:
for i in `cat unusual.txt`; do echo "$RANDOM $i"; done | sed 's/^/0000/' | sed 's/^0*\([0-9]\{5\}[ ].*$\)/\1/' | sort | sed -r 's/^[0-9]+ //' > randorder.txt
btw. this page is the first Google hit when you search for .
why don't you just use shuf? like
shuf unusual.txt > randorder.txt
Well I would say that using shuf would be quite a bit easier...
But just as a comment on the original method: I had a bit of trouble getting it to work when the lines of my file contained spaces, causing each separated word to get a line of its own.
Anywho, don't know if this is just my setup or that particular file, but I found a way that worked, very much inspired by your command, and it is as follows:
while read -r line; do echo "$RANDOM $line"; done < unusual.txt | sort | sed -r 's/^[0-9]+ //' > rand.txt
Cheers... c",)
Shuf doesn't exist on the box i have. and i can't seem to get:
for i in `cat unusual.txt`; do echo "$RANDOM $i"; done | sort | sed -r 's/^[0-9]+ //' > randorder.txt
keep getting the sed man page popping up.
and i don't really understand how:
while read -r line; do echo "$RANDOM $line"; done < unusual.txt | sort | sed -r 's/^[0-9]+ //' > rand.txt
reads in a file in the first place? where is the input?
My bad, it wasn't pasting right into the terminal, it works. (original solution)
More efficient way (compared to the for loop)
cat unusual.txt | while read -r line; do ...
How about:
sort -R unusual.txt > random.txt
or printing just one random line:
sort -R unusual.txt | tail -1
nice!
sort -R isn't available everywhere (it's a GNU sort option, not part of bash).
Dude, it's like the Schwartzian transform in bash!
when I have a file with spaces where I want to iterate over each line I change the input field separator environment variable like:
$ export IFS='
'
$ for line in `cat file`
> {
> echo "$RANDOM $line"
> } | sort -n | sed -r 's/^[0-9]+ //' > random_file
Just to add another tweak...
while read i ;do echo “$RANDOM $i”; done randorder.txt
My previous comment got screwed up by the formatting...
First part: while read i; do echo "$RANDOM $i"; done < unusual.txt | sort | sed -r 's/^[0-9]+ //' > randorder.txt
How about:
awk 'BEGIN{srand()}{print rand(),$0}' SOMEFILE | sort -n | cut -d ' ' -f2-
sort -R
:)
this command fails when the file has duplicated lines :(
Great tip!
Since I had numbers in my file, I used awk instead of sed, which I found easier:
for i in `cat list_to_randomize.txt`; do echo "$RANDOM $i"; done | sort | awk '{print $2}' > randomized_list.txt
Thanks a lot for this page!
G.
none of these worked for me. i found the solution in a far-away google result. (be careful: even though this is the first google result, the formatting sucks and copy-paste may give errors)
awk 'BEGIN {srand()} {print int(rand()*1000000) "\t" $0}' file | sort -n | cut -f 2-
sort -R will only work if the lines are unique. If there are duplicate lines, sort -R will put them next to each other.
shuf on the other hand will sufficiently randomize a list, including NOT putting duplicate lines next to each other.
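That difference is easy to see on a tiny file, assuming GNU sort and shuf are installed:

```shell
# Equal lines hash equally under sort -R, so duplicates end up adjacent;
# shuf permutes positions independently, so they can interleave.
printf 'a\na\nb\nb\n' > dupes.txt

sort -R dupes.txt   # the two a's stay together, likewise the b's
shuf dupes.txt      # any ordering of the four lines is possible
```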
this actually works for me. it's simpler and uses colrm, which rocks.
for i in `cat mapslist.txt`; do echo `printf "%08d" $RANDOM` $i; done | sort | colrm 1 9 > mapslist.txt.rand; mv mapslist.txt mapslist.old; mv mapslist.txt.rand mapslist.txt
"why don't you just use shuf? Why not use sort -R?"
If I had shuf, or if my local /bin/sort supported -R, I would use it. But I don't, which is why I'm using something fork-y in bash.
"why don't you just get shuf?"
Because I don't have root and/or gcc on every machine I touch.
HOWEVER, with what was mentioned above, I can make shuf:
echo -e '#!/bin/bash'"\n"'while read -r LINE; do echo "$RANDOM $LINE"; done |sort |sed -r '"'s/^[0-9]+ //'" > ~/bin/shuf
chmod 0755 ~/bin/shuf
Now I can put | shuf into my chains of pipes.
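The same script is a bit easier to write with printf, which avoids the echo -e quoting maze. A sketch under the same assumptions (~/bin exists and is on your PATH); it swaps in sort -n and the portable s/^[0-9]* // sed form so the script also runs with BSD sed:

```shell
# Recreate the homemade shuf with printf instead of echo -e
mkdir -p ~/bin
printf '%s\n' '#!/bin/bash' \
  'while read -r LINE; do echo "$RANDOM $LINE"; done | sort -n | sed "s/^[0-9]* //"' \
  > ~/bin/shuf
chmod 0755 ~/bin/shuf

# Use it anywhere in a pipeline
printf 'one\ntwo\nthree\n' | ~/bin/shuf
```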
'Shuf' and 'rl' are good options here. There are two small tutorials on my blog.
nl -ba file | sort -R | sed 's/.*[0-9]\t//'
This is probably easiest and the most portable. Basically, use sort -R, but add line numbers to the file first to make all lines unique, then remove them after the sort.
Just what I needed. OS X doesn't have rl, shuf. There is no -R option for sort. And I'm using an iBook G4 and cannot afford the space to compile them. Thanks!
cat -n unusual.txt | sort -R | cut -f2-
Thanks, exactly what I needed!
sort -R
shuf
rl (requires apt-get install randomize-lines but it's pretty awesome)
I think a simple command will do:
cat list.txt | sort -R > randList.txt