I have to do some find-and-replace tasks on a rather big file, about 47 GB in size.
Does anybody know how to do this? I tried tools like TextCrawler, EditPad Lite and more, but none of them supports a file this large.
I'm assuming this can be done via the command line.
Do you have an idea how this can be accomplished ?
6 Answers
Sed (stream editor for filtering and transforming text) is your friend.
Sed performs text transformations in a single pass.
I use FART - Find And Replace Text by Lionello Lunesu.
It works very well on Windows 7 x64.
You can find and replace the text using this command:
On Unix or Mac:
sed 's/oldstring/newstring/g' oldfile.txt > newfile.txt
fast and easy...
For me, none of the tools suggested here worked well. TextCrawler ate all my computer's memory, sed didn't work at all, EditPad complained about memory...
The solution is: create your own script in python, perl or even C++.
Or use the tool PowerGrep, this is the easiest and fastest option.
I haven't tried FART; it's command-line only and maybe not very friendly.
Some hex editors, such as UltraEdit, also work well.
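The "create your own script" option above could look roughly like the following in Python. This is only a minimal sketch, not the answerer's actual script: the function names and the file names in the usage comment are placeholders, and it assumes the file contains line breaks and that the search string never spans two lines.

```python
def replace_lines(lines, old, new):
    """Yield each line with every occurrence of old replaced by new."""
    for line in lines:
        yield line.replace(old, new)


def replace_in_file(src_path, dst_path, old, new):
    """Stream src_path into dst_path, holding one line in memory at a time."""
    # Binary mode skips text decoding and leaves line endings untouched.
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        dst.writelines(replace_lines(src, old, new))


# Hypothetical usage; the paths and strings are placeholders:
# replace_in_file("oldfile.txt", "newfile.txt", b"oldstring", b"newstring")
```

Because the output goes to a second file, you need enough free disk space for a full copy. A file that is one giant line would still be read into memory in one piece, so that case would need a chunked approach instead.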
If you are using a Unix-like system, then you can use cat | sed to do this.
The example replaces com with net in a list of domain names, and you can then pipe the output to a file.
I used
to replace all the instances of n's in my 7 GB file.
If I omitted the > newfile.fasta part, it took ages because it scrolled up the screen showing me every line of the file.
With the > newfile redirect, it ran in a matter of seconds on an Ubuntu server.
I have a large CSV file to edit (tens of thousands of lines). The task is to edit it in a user-friendly and fast way, so I would like to use a PyQt QTableView to hold it. Some filtering is also required; basically I need to do operations like 'update price = 200 where name = 'Jack''.
I've come up with a few options, but wonder if we could combine their advantages.
Update: the code is on a workstation that cannot access the internet, so I'll write pseudocode instead, sorry.
1, import into local sqlite
The import takes more than a minute to disk and less than 10 seconds to memory. With a QSqlTableModel it's relatively fast, and adding a filter is easy. Rows are not fully displayed until they are scrolled into view, which is good.
2, just parse the CSV into a model and display it; I referenced "pyqt - populating QTableWidget with csv data", using the answer by user1006989
I just put each cell into a QStandardItem. The Model/View works fine, but loading large files is too slow (about 20 seconds here), and I don't know how to implement a filter (if we just skip rows when loading into the model, how can I write the data back?).
3, command line replace
I've implemented both options 1 and 2. They're not very quick but might be acceptable. Here I wonder whether composing a Perl-like regex replacement could help (we need to see the original value first).
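For reference, option 1 above can be sketched without any Qt code using Python's standard sqlite3 and csv modules. This is an assumption-laden illustration, not the code from the workstation: the table name items, the helper names, and the sample columns id, name and price are all made up, since the post does not show the real file layout.

```python
import csv
import io
import sqlite3


def load_csv(conn, csv_text):
    """Load CSV text with a header row into a table named items."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    cols = ", ".join(f'"{name}"' for name in header)
    marks = ", ".join("?" for _ in header)
    conn.execute(f"CREATE TABLE items ({cols})")
    conn.executemany(f"INSERT INTO items VALUES ({marks})", data)


def dump_csv(conn):
    """Write the items table back out as CSV text, header first."""
    cur = conn.execute("SELECT * FROM items ORDER BY rowid")
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([d[0] for d in cur.description])
    writer.writerows(cur)
    return out.getvalue()


# Hypothetical sample data; the real file's columns are not given in the post.
SAMPLE = "id,name,price\n1,Jack,100\n2,Jill,150\n3,Jack,120\n"

conn = sqlite3.connect(":memory:")  # in-memory, the faster case mentioned above
load_csv(conn, SAMPLE)
conn.execute("UPDATE items SET price = ? WHERE name = ?", ("200", "Jack"))
result = dump_csv(conn)
```

All values are stored as text here, so the exported file keeps the same formatting as the input apart from the edited cells.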
The proposed workflow is:
regex search A> display / populate in table / model B> accept edit C> prepare a regex replace
This gives a solution where I do not have to load the full content into a database or populate it into a widget, so it should be even faster while still being equipped with a filter, but I have a problem with C>.
If the filter gets 4 records and I edit one of them, should I just back up the filter result, diff it against the edited version, and then prepare a whole-line replacement? (Each line is unique; there are some primary-key-like fields inside that won't change.)
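One possible reading of the backup-and-diff idea, as a rough Python sketch. The function names, the sample rows and the regex are hypothetical, and the table/model display of step B is replaced by a plain dict edit.

```python
import re


def filter_lines(lines, pattern):
    """Step A: return {line_number: line} for lines matching the regex."""
    rx = re.compile(pattern)
    return {n: line for n, line in enumerate(lines) if rx.search(line)}


def diff_edits(before, after):
    """Step C: compare the backed-up filter result with the edited copy and
    return {original_line: edited_line} for rows that actually changed."""
    return {before[n]: after[n] for n in before if after[n] != before[n]}


# Hypothetical rows; the real file layout is not shown in the post.
lines = ["1,Jack,100", "2,Jill,150", "3,Jack,120"]

before = filter_lines(lines, r",Jack,")  # backup of the filter result
after = dict(before)                     # copy that the table would display
after[2] = "3,Jack,200"                  # step B: the user edits one row
edits = diff_edits(before, after)
```

Since every line is unique, the original line itself can serve as the key for a whole-line replacement, so no regex needs to be generated for unchanged rows.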
I hope someone can review my thoughts or give some advice.
Thanks a lot for your time :)
1 Answer
In the end I went with the 3rd way, but the 'grep files into a smaller result' step was also done in Python because it's Windows.
It's faster: less than 1 second to grep and less than 2 seconds for each changed line. I still do not have a good way to do multiple replacements in a file,
which means in the current version, if
I have to loop over the whole large file for B > B1, even if there is only one match, and then again for D1 and E1. There should be a more complex regex replace.
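One way to avoid a full pass per changed line might be to collect all pending changes in a dict and apply them in a single pass. The sketch below is an assumption, not the code described above: apply_edits and edits are hypothetical names, and it relies on every line being unique, as stated in the question.

```python
def apply_edits(lines, edits):
    """Yield lines, swapping any line found in edits for its replacement.

    edits maps an original line (without its newline) to the new line, so
    every pending change is applied during a single pass over the file.
    """
    for line in lines:
        body = line.rstrip("\r\n")
        if body in edits:
            # Keep whatever line ending the original line had.
            yield edits[body] + line[len(body):]
        else:
            yield line


# Hypothetical usage; the file names are placeholders:
# with open("big.csv", newline="") as src, \
#         open("big.new.csv", "w", newline="") as dst:
#     dst.writelines(apply_edits(src, edits))
```

The dict lookup is constant time per line, so the cost is one read of the file regardless of how many lines were edited.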