Replace all occurrences of C11137159 in file3.csv with XXXXXXXXX and save the modified content in file3_mod.csv.
Command to execute the substitution:
$ sed 's/C11137159/XXXXXXXXX/' file3.csv > file3_mod.csv
Content of file3.csv before the substitution: | Content of the file after the substitution: |
---|---|
$ cat file3.csv Anonymized ID,Subject Group,AGE C11138122,MISSING,21 C11138192,Group1,54 B12226507,Group1,68 B12226546,Group1,67 C11138122,Group1,24 C11138184,Group1,59 C11138797,Group1,22 C11138152,Group1,53 C11138150,Group1,41 C11137167,Group3,14 C11137159,Group3,13 C11137167,Group3,16 C11137159,Group3,13 C11131039,Group2 b,67 C11135566,Group2 b,73 B11119903,Group2 b,83 C11137544,Group1,21 C11137443,Group3,11 C11137123,Group2 b,69 C11137439,Group3,79 C11137439,Group3,15 C11133100,Group1,23 D11144030,Group3,13 B11108399,Group1,23 B11108326,Group1,59 B11119909,Group2 b,61 B11110893,Group1,28 |
$ cat file3_mod.csv Anonymized ID,Subject Group,AGE C11138122,MISSING,21 C11138192,Group1,54 B12226507,Group1,68 B12226546,Group1,67 C11138122,Group1,24 C11138184,Group1,59 C11138797,Group1,22 C11138152,Group1,53 C11138150,Group1,41 C11137167,Group3,14 XXXXXXXXX,Group3,13 C11137167,Group3,16 XXXXXXXXX,Group3,13 C11131039,Group2 b,67 C11135566,Group2 b,73 B11119903,Group2 b,83 C11137544,Group1,21 C11137443,Group3,11 C11137123,Group2 b,69 C11137439,Group3,79 C11137439,Group3,15 C11133100,Group1,23 D11144030,Group3,13 B11108399,Group1,23 B11108326,Group1,59 B11119909,Group2 b,61 B11110893,Group1,28 |
In the following example, instead of replacing a fix string as we did in example 1, we will replace a group of characters (i.e. all upper-case characters in the file) by a single character or another group of characters (i.e. replace with lower-case characters). The groups of characters that can be used are listed in the following table:
Expression | Group of characters |
---|---|
[:alnum:] | Letters and digits |
[:alpha:] | Letters |
[:blank:] | Horizontal white space |
[:cntrl:] | Control characters |
[:digit:] | Digits |
[:graph:] | Printable characters, excluding space |
[:lower:] | Lower-case letters |
[:print:] | Printable characters, including space |
[:punct:] | Punctuation characters |
[:space:] | Horizontal or vertical white space |
[:upper:] | Upper-case letters |
[:xdigit:] | Hexadecimal digits |
Replace all upper-case letters in file3.csv by lower-case.
Original content of the file | Content after replacements |
---|---|
$ cat file3.csv Anonymized ID,Subject Group,AGE C11138122,MISSING,21 C11138192,Group1,54 B12226507,Group1,68 B12226546,Group1,67 C11138122,Group1,24 C11138184,Group1,59 C11138797,Group1,22 C11138152,Group1,53 C11138150,Group1,41 C11137167,Group3,14 C11137159,Group3,13 C11137167,Group3,16 C11137159,Group3,13 C11131039,Group2 b,67 C11135566,Group2 b,73 B11119903,Group2 b,83 C11137544,Group1,21 C11137443,Group3,11 C11137123,Group2 b,69 C11137439,Group3,79 C11137439,Group3,15 C11133100,Group1,23 D11144030,Group3,13 B11108399,Group1,23 B11108326,Group1,59 B11119909,Group2 b,61 B11110893,Group1,28 |
$ cat file3.csv | tr '[:upper:]' '[:lower:]' anonymized id,subject group,age c11138122,missing,21 c11138192,group1,54 b12226507,group1,68 b12226546,group1,67 c11138122,group1,24 c11138184,group1,59 c11138797,group1,22 c11138152,group1,53 c11138150,group1,41 c11137167,group3,14 c11137159,group3,13 c11137167,group3,16 c11137159,group3,13 c11131039,group2 b,67 c11135566,group2 b,73 b11119903,group2 b,83 c11137544,group1,21 c11137443,group3,11 c11137123,group2 b,69 c11137439,group3,79 c11137439,group3,15 c11133100,group1,23 d11144030,group3,13 b11108399,group1,23 b11108326,group1,59 b11119909,group2 b,61 b11110893,group1,28 |
Replace all lower-case letters in file3.csv by upper-case.
Original content of the file | Content after replacements |
---|---|
$ cat file3.csv Anonymized ID,Subject Group,AGE C11138122,MISSING,21 C11138192,Group1,54 B12226507,Group1,68 B12226546,Group1,67 C11138122,Group1,24 C11138184,Group1,59 C11138797,Group1,22 C11138152,Group1,53 C11138150,Group1,41 C11137167,Group3,14 C11137159,Group3,13 C11137167,Group3,16 C11137159,Group3,13 C11131039,Group2 b,67 C11135566,Group2 b,73 B11119903,Group2 b,83 C11137544,Group1,21 C11137443,Group3,11 C11137123,Group2 b,69 C11137439,Group3,79 C11137439,Group3,15 C11133100,Group1,23 D11144030,Group3,13 B11108399,Group1,23 B11108326,Group1,59 B11119909,Group2 b,61 B11110893,Group1,28 |
$ cat file3.csv | tr '[:lower:]' '[:upper:]' ANONYMIZED ID,SUBJECT GROUP,AGE C11138122,MISSING,21 C11138192,GROUP1,54 B12226507,GROUP1,68 B12226546,GROUP1,67 C11138122,GROUP1,24 C11138184,GROUP1,59 C11138797,GROUP1,22 C11138152,GROUP1,53 C11138150,GROUP1,41 C11137167,GROUP3,14 C11137159,GROUP3,13 C11137167,GROUP3,16 C11137159,GROUP3,13 C11131039,GROUP2 B,67 C11135566,GROUP2 B,73 B11119903,GROUP2 B,83 C11137544,GROUP1,21 C11137443,GROUP3,11 C11137123,GROUP2 B,69 C11137439,GROUP3,79 C11137439,GROUP3,15 C11133100,GROUP1,23 D11144030,GROUP3,13 B11108399,GROUP1,23 B11108326,GROUP1,59 B11119909,GROUP2 B,61 B11110893,GROUP1,28 |
Replace all alphabetical characters in file3.csv by the number 0
Original content of the file | Content after replacements |
---|---|
$ cat file3.csv Anonymized ID,Subject Group,AGE C11138122,MISSING,21 C11138192,Group1,54 B12226507,Group1,68 B12226546,Group1,67 C11138122,Group1,24 C11138184,Group1,59 C11138797,Group1,22 C11138152,Group1,53 C11138150,Group1,41 C11137167,Group3,14 C11137159,Group3,13 C11137167,Group3,16 C11137159,Group3,13 C11131039,Group2 b,67 C11135566,Group2 b,73 B11119903,Group2 b,83 C11137544,Group1,21 C11137443,Group3,11 C11137123,Group2 b,69 C11137439,Group3,79 C11137439,Group3,15 C11133100,Group1,23 D11144030,Group3,13 B11108399,Group1,23 B11108326,Group1,59 B11119909,Group2 b,61 B11110893,Group1,28 |
$ cat file3.csv | tr '[:alpha:]' 0 0000000000 00,0000000 00000,000 011138122,0000000,21 011138192,000001,54 012226507,000001,68 012226546,000001,67 011138122,000001,24 011138184,000001,59 011138797,000001,22 011138152,000001,53 011138150,000001,41 011137167,000003,14 011137159,000003,13 011137167,000003,16 011137159,000003,13 011131039,000002 0,67 011135566,000002 0,73 011119903,000002 0,83 011137544,000001,21 011137443,000003,11 011137123,000002 0,69 011137439,000003,79 011137439,000003,15 011133100,000001,23 011144030,000003,13 011108399,000001,23 011108326,000001,59 011119909,000002 0,61 011110893,000001,28 |
Replace all digits in file3.csv by the letter X.
Original content of the file | Content after replacements |
---|---|
$ cat file3.csv Anonymized ID,Subject Group,AGE C11138122,MISSING,21 C11138192,Group1,54 B12226507,Group1,68 B12226546,Group1,67 C11138122,Group1,24 C11138184,Group1,59 C11138797,Group1,22 C11138152,Group1,53 C11138150,Group1,41 C11137167,Group3,14 C11137159,Group3,13 C11137167,Group3,16 C11137159,Group3,13 C11131039,Group2 b,67 C11135566,Group2 b,73 B11119903,Group2 b,83 C11137544,Group1,21 C11137443,Group3,11 C11137123,Group2 b,69 C11137439,Group3,79 C11137439,Group3,15 C11133100,Group1,23 D11144030,Group3,13 B11108399,Group1,23 B11108326,Group1,59 B11119909,Group2 b,61 B11110893,Group1,28 |
$ cat file3.csv | tr '[:digit:]' X Anonymized ID,Subject Group,AGE CXXXXXXXX,MISSING,XX CXXXXXXXX,GroupX,XX BXXXXXXXX,GroupX,XX BXXXXXXXX,GroupX,XX CXXXXXXXX,GroupX,XX CXXXXXXXX,GroupX,XX CXXXXXXXX,GroupX,XX CXXXXXXXX,GroupX,XX CXXXXXXXX,GroupX,XX CXXXXXXXX,GroupX,XX CXXXXXXXX,GroupX,XX CXXXXXXXX,GroupX,XX CXXXXXXXX,GroupX,XX CXXXXXXXX,GroupX b,XX CXXXXXXXX,GroupX b,XX BXXXXXXXX,GroupX b,XX CXXXXXXXX,GroupX,XX CXXXXXXXX,GroupX,XX CXXXXXXXX,GroupX b,XX CXXXXXXXX,GroupX,XX CXXXXXXXX,GroupX,XX CXXXXXXXX,GroupX,XX DXXXXXXXX,GroupX,XX BXXXXXXXX,GroupX,XX BXXXXXXXX,GroupX,XX BXXXXXXXX,GroupX b,XX BXXXXXXXX,GroupX,XX |
Replace all punctuation characters in file3.csv by a space (' ')
Original content of the file | Content after replacements |
---|---|
$ cat file3.csv Anonymized ID,Subject Group,AGE C11138122,MISSING,21 C11138192,Group1,54 B12226507,Group1,68 B12226546,Group1,67 C11138122,Group1,24 C11138184,Group1,59 C11138797,Group1,22 C11138152,Group1,53 C11138150,Group1,41 C11137167,Group3,14 C11137159,Group3,13 C11137167,Group3,16 C11137159,Group3,13 C11131039,Group2 b,67 C11135566,Group2 b,73 B11119903,Group2 b,83 C11137544,Group1,21 C11137443,Group3,11 C11137123,Group2 b,69 C11137439,Group3,79 C11137439,Group3,15 C11133100,Group1,23 D11144030,Group3,13 B11108399,Group1,23 B11108326,Group1,59 B11119909,Group2 b,61 B11110893,Group1,28 |
$ cat file3.csv | tr '[:punct:]' ' ' Anonymized ID Subject Group AGE C11138122 MISSING 21 C11138192 Group1 54 B12226507 Group1 68 B12226546 Group1 67 C11138122 Group1 24 C11138184 Group1 59 C11138797 Group1 22 C11138152 Group1 53 C11138150 Group1 41 C11137167 Group3 14 C11137159 Group3 13 C11137167 Group3 16 C11137159 Group3 13 C11131039 Group2 b 67 C11135566 Group2 b 73 B11119903 Group2 b 83 C11137544 Group1 21 C11137443 Group3 11 C11137123 Group2 b 69 C11137439 Group3 79 C11137439 Group3 15 C11133100 Group1 23 D11144030 Group3 13 B11108399 Group1 23 B11108326 Group1 59 B11119909 Group2 b 61 B11110893 Group1 28 |
Replace all white spaces in file3.csv by an underscore '_'.
Original content of the file | Content after replacements |
---|---|
$ cat file3.csv Anonymized ID,Subject Group,AGE C11138122,MISSING,21 C11138192,Group1,54 B12226507,Group1,68 B12226546,Group1,67 C11138122,Group1,24 C11138184,Group1,59 C11138797,Group1,22 C11138152,Group1,53 C11138150,Group1,41 C11137167,Group3,14 C11137159,Group3,13 C11137167,Group3,16 C11137159,Group3,13 C11131039,Group2 b,67 C11135566,Group2 b,73 B11119903,Group2 b,83 C11137544,Group1,21 C11137443,Group3,11 C11137123,Group2 b,69 C11137439,Group3,79 C11137439,Group3,15 C11133100,Group1,23 D11144030,Group3,13 B11108399,Group1,23 B11108326,Group1,59 B11119909,Group2 b,61 B11110893,Group1,28 |
$ cat file3.csv | tr '[:blank:]' '_' Anonymized_ID,Subject_Group,AGE C11138122,MISSING,21 C11138192,Group1,54 B12226507,Group1,68 B12226546,Group1,67 C11138122,Group1,24 C11138184,Group1,59 C11138797,Group1,22 C11138152,Group1,53 C11138150,Group1,41 C11137167,Group3,14 C11137159,Group3,13 C11137167,Group3,16 C11137159,Group3,13 C11131039,Group2_b,67 C11135566,Group2_b,73 B11119903,Group2_b,83 C11137544,Group1,21 C11137443,Group3,11 C11137123,Group2_b,69 C11137439,Group3,79 C11137439,Group3,15 C11133100,Group1,23 D11144030,Group3,13 B11108399,Group1,23 B11108326,Group1,59 B11119909,Group2_b,61 B11110893,Group1,28 |
You can also replace a range of letters or numbers:
Replace any A, B or C (letters in the range A-C) in file3.csv by the letter D.
Original content of the file | Content after replacements |
---|---|
$ cat file3.csv Anonymized ID,Subject Group,AGE C11138122,MISSING,21 C11138192,Group1,54 B12226507,Group1,68 B12226546,Group1,67 C11138122,Group1,24 C11138184,Group1,59 C11138797,Group1,22 C11138152,Group1,53 C11138150,Group1,41 C11137167,Group3,14 C11137159,Group3,13 C11137167,Group3,16 C11137159,Group3,13 C11131039,Group2 b,67 C11135566,Group2 b,73 B11119903,Group2 b,83 C11137544,Group1,21 C11137443,Group3,11 C11137123,Group2 b,69 C11137439,Group3,79 C11137439,Group3,15 C11133100,Group1,23 D11144030,Group3,13 B11108399,Group1,23 B11108326,Group1,59 B11119909,Group2 b,61 B11110893,Group1,28 |
$ cat file3.csv | tr 'A-C' 'D' Dnonymized ID,Subject Group,DGE D11138122,MISSING,21 D11138192,Group1,54 D12226507,Group1,68 D12226546,Group1,67 D11138122,Group1,24 D11138184,Group1,59 D11138797,Group1,22 D11138152,Group1,53 D11138150,Group1,41 D11137167,Group3,14 D11137159,Group3,13 D11137167,Group3,16 D11137159,Group3,13 D11131039,Group2 b,67 D11135566,Group2 b,73 D11119903,Group2 b,83 D11137544,Group1,21 D11137443,Group3,11 D11137123,Group2 b,69 D11137439,Group3,79 D11137439,Group3,15 D11133100,Group1,23 D11144030,Group3,13 D11108399,Group1,23 D11108326,Group1,59 D11119909,Group2 b,61 D11110893,Group1,28 |
Replace A by W, B by X, C by Y, and D by Z in file3.csv (replace letters in the range A-D with letters in the range W-Z)
Original content of the file | Content after replacements |
---|---|
$ cat file3.csv Anonymized ID,Subject Group,AGE C11138122,MISSING,21 C11138192,Group1,54 B12226507,Group1,68 B12226546,Group1,67 C11138122,Group1,24 C11138184,Group1,59 C11138797,Group1,22 C11138152,Group1,53 C11138150,Group1,41 C11137167,Group3,14 C11137159,Group3,13 C11137167,Group3,16 C11137159,Group3,13 C11131039,Group2 b,67 C11135566,Group2 b,73 B11119903,Group2 b,83 C11137544,Group1,21 C11137443,Group3,11 C11137123,Group2 b,69 C11137439,Group3,79 C11137439,Group3,15 C11133100,Group1,23 D11144030,Group3,13 B11108399,Group1,23 B11108326,Group1,59 B11119909,Group2 b,61 B11110893,Group1,28 |
$ cat file3.csv | tr 'A-D' 'W-Z' Wnonymized IZ,Subject Group,WGE Y11138122,MISSING,21 Y11138192,Group1,54 X12226507,Group1,68 X12226546,Group1,67 Y11138122,Group1,24 Y11138184,Group1,59 Y11138797,Group1,22 Y11138152,Group1,53 Y11138150,Group1,41 Y11137167,Group3,14 Y11137159,Group3,13 Y11137167,Group3,16 Y11137159,Group3,13 Y11131039,Group2 b,67 Y11135566,Group2 b,73 X11119903,Group2 b,83 Y11137544,Group1,21 Y11137443,Group3,11 Y11137123,Group2 b,69 Y11137439,Group3,79 Y11137439,Group3,15 Y11133100,Group1,23 Z11144030,Group3,13 X11108399,Group1,23 X11108326,Group1,59 X11119909,Group2 b,61 X11110893,Group1,28 |
The command tr doesn't allow you to replace by an empty space in order to delete a character or a set of characters, but you can use the -d flag for deletion.
Remove all spaces in file3.csv.
Original content of the file | Content after replacements |
---|---|
$ cat file3.csv $ cat file3.csv Anonymized ID,Subject Group,AGE C11138122,MISSING,21 C11138192,Group1,54 B12226507,Group1,68 B12226546,Group1,67 C11138122,Group1,24 C11138184,Group1,59 C11138797,Group1,22 C11138152,Group1,53 C11138150,Group1,41 C11137167,Group3,14 C11137159,Group3,13 C11137167,Group3,16 C11137159,Group3,13 C11131039,Group2 b,67 C11135566,Group2 b,73 B11119903,Group2 b,83 C11137544,Group1,21 C11137443,Group3,11 C11137123,Group2 b,69 C11137439,Group3,79 C11137439,Group3,15 C11133100,Group1,23 D11144030,Group3,13 B11108399,Group1,23 B11108326,Group1,59 B11119909,Group2 b,61 B11110893,Group1,28 |
$ cat file3.csv | tr -d '[:blank:]' AnonymizedID,SubjectGroup,AGE C11138122,MISSING,21 C11138192,Group1,54 B12226507,Group1,68 B12226546,Group1,67 C11138122,Group1,24 C11138184,Group1,59 C11138797,Group1,22 C11138152,Group1,53 C11138150,Group1,41 C11137167,Group3,14 C11137159,Group3,13 C11137167,Group3,16 C11137159,Group3,13 C11131039,Group2b,67 C11135566,Group2b,73 B11119903,Group2b,83 C11137544,Group1,21 C11137443,Group3,11 C11137123,Group2b,69 C11137439,Group3,79 C11137439,Group3,15 C11133100,Group1,23 D11144030,Group3,13 B11108399,Group1,23 B11108326,Group1,59 B11119909,Group2b,61 B11110893,Group1,28 |
In order to delete any repeated (continuous) character or sequence use the -s flag.
Remove any repeated characters ([:alnum:]) in file3.csv.
Original content of the file | Content after replacements |
---|---|
$ cat file3.csv Anonymized ID,Subject Group,AGE C11138122,MISSING,21 C11138192,Group1,54 B12226507,Group1,68 B12226546,Group1,67 C11138122,Group1,24 C11138184,Group1,59 C11138797,Group1,22 C11138152,Group1,53 C11138150,Group1,41 C11137167,Group3,14 C11137159,Group3,13 C11137167,Group3,16 C11137159,Group3,13 C11131039,Group2 b,67 C11135566,Group2 b,73 B11119903,Group2 b,83 C11137544,Group1,21 C11137443,Group3,11 C11137123,Group2 b,69 C11137439,Group3,79 C11137439,Group3,15 C11133100,Group1,23 D11144030,Group3,13 B11108399,Group1,23 B11108326,Group1,59 B11119909,Group2 b,61 B11110893,Group1,28 |
$ cat file3.csv | tr -s '[:alnum:]' Anonymized ID,Subject Group,AGE C13812,MISING,21 C138192,Group1,54 B126507,Group1,68 B126546,Group1,67 C13812,Group1,24 C138184,Group1,59 C138797,Group1,2 C138152,Group1,53 C138150,Group1,41 C137167,Group3,14 C137159,Group3,13 C137167,Group3,16 C137159,Group3,13 C131039,Group2 b,67 C1356,Group2 b,73 B1903,Group2 b,83 C13754,Group1,21 C13743,Group3,1 C137123,Group2 b,69 C137439,Group3,79 C137439,Group3,15 C1310,Group1,23 D14030,Group3,13 B10839,Group1,23 B108326,Group1,59 B1909,Group2 b,61 B10893,Group1,28 |