Re: Characters in an uploaded text file being corrupted
Am thinking that I might be able to handle the character set encoding processing a bit more simply using Python, which is currently my other primary programming language, so I might try creating a small cross-platform utility app that they can use to sort out something like this before uploading content, but, let's see...
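A minimal sketch of what such a pre-upload utility could look like in Python. Nothing here comes from the thread itself: the function names `to_utf8` and `convert_file` are hypothetical, and the cp1252 fallback is an assumption about what a Windows editor writes when no byte order mark is present. It also assumes that a file saved as "Unicode" on Windows is UTF-16 with a byte order mark, which is common but not guaranteed.

```python
import codecs

# Byte order marks, checked longest-first so that UTF-32 LE
# (FF FE 00 00) is not mistaken for UTF-16 LE (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def to_utf8(raw: bytes, fallback: str = "cp1252") -> bytes:
    """Return the content of `raw` re-encoded as UTF-8 without a BOM."""
    # 1. A BOM tells us the encoding directly; these codecs strip it on decode.
    for bom, codec_name in _BOMS:
        if raw.startswith(bom):
            return raw.decode(codec_name).encode("utf-8")
    # 2. No BOM: if the bytes are already valid UTF-8, leave them alone.
    try:
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        pass
    # 3. Otherwise assume a legacy Windows code page (an assumption, not a
    #    detection); bytes undefined in that code page become U+FFFD.
    return raw.decode(fallback, errors="replace").encode("utf-8")


def convert_file(src_path: str, dst_path: str) -> None:
    """Read src_path in whatever encoding it has and write UTF-8 to dst_path."""
    with open(src_path, "rb") as src:
        data = to_utf8(src.read())
    with open(dst_path, "wb") as dst:
        dst.write(data)
```

Calling `convert_file("export.txt", "export-utf8.txt")` (file names are placeholders) before uploading would hand the PHP side one known encoding, so the database connection and column only ever have to deal with UTF-8.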
Stay well

Jacob Kruger
Blind Biker
Skype: BlindZA
'...fate had broken his body, but not his spirit...'

----- Original Message -----
From: "Carl Roett" <carlroett@xxxxxxxxx>
To: <php-windows@xxxxxxxxxxxxx>
Sent: Sunday, May 27, 2012 6:47 PM
Subject: Re: Characters in an uploaded text file being corrupted

Do not use Microsoft Word to save files that need to be read-in by other software. Ever. It *always* finds a way to screw it up. Even if you cut and paste it into the other program, I've heard cases where Word has put unprintable control characters into the pasted output, that showed up as spaces in the form, then corrupted the output at runtime.

I suggest you open the saved file in NetBeans. In the NetBeans settings dialogue, enable the options to "show control characters" and "process file as unicode" (iirc). That will cause NetBeans to display all characters your font set contains, display a square box for the ones it doesn't, and display the various icons for control characters.

If column alignment in your file is important (for example a set number of spaces between tokens and operators), you should be using a monospace font. The best one I've ever used is available here:

http://code.google.com/p/buddypress-media/downloads/detail?name=ttf-bitstream-vera-1.10.zip

It's also really easy to read.

For text streams in production site, take a look at BP-Media's sanitizer classes. Unicode to ASCII to HTML entity conversion is a tricky business, and these functions will ensure you get the right kind of output no matter what users throw at you. For example, people embedding unicode sequences in an ASCII stream.

http://code.google.com/p/buddypress-media/source/browse/bp_media/trunk/core/database/class.database.sanitizers.php

^C^

===================================================

On Sun, May 27, 2012 at 8:57 AM, Jacob Kruger <jacobk@xxxxxxxxxxxxxx> wrote:

Ok, specifically told that storage field to make use of UTF8_unicode_ci, and tried re-uploading text file that specifically saved from word using UTF8 encoding, and, the text content is still full of garbage characters..?

Suppose might try something like copying/pasting from text file in notepad, into something like a textarea field, but, don't think that would always be the perfect process in this site.

Will try a couple of other options with regard to DB encoding, and see if come right, but, let's see...

Jacob Kruger
Blind Biker
Skype: BlindZA
'...fate had broken his body, but not his spirit...'

----- Original Message -----
From: "Niel Archer" <spam-free@xxxxxxxxxxxxxxxx>
To: <php-windows@xxxxxxxxxxxxx>
Sent: Sunday, May 27, 2012 4:22 PM
Subject: Re: Characters in an uploaded text file being corrupted

Hi

> Ok, and, FWIW, this test text file is a word document that have saved as
> a text file, and when now resaved it using specifically unicode encoding,
> it seemed to eliminate this issue,

Make sure the MySQL Db, table, field are correctly set for the encoding. MySQL does not default to UTF-8 so you have to manually set it when creating the Db/table. IIRC, newer versions can also have the encoding set per field.

> but would have thought there might be a relatively simple way to handle
> something like encoding conversion in PHP itself..?

PHP cannot read minds, unfortunately. There are ways to handle encoding conversions, but I don't think anyone would call them 'simple'. ;-) See the Multibyte String extension for one way.

--
Niel Archer
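Carl's advice earlier in the thread, to look for unprintable control characters that display as spaces, can also be approximated without an editor. The following is only a sketch under stated assumptions: the function name `find_suspect_chars` is hypothetical, and the choice of what counts as "suspect" (Unicode control and format characters, plus any space separator other than a plain space) is an illustrative guess at what Word might leave behind, not something the thread specifies. The text is assumed to be already decoded to a Python string.

```python
import unicodedata

# Whitespace that is expected in an ordinary text file and never reported.
_ALLOWED = {"\t", "\r", " "}


def find_suspect_chars(text: str):
    """Return a list of (line, column, codepoint, name) for hidden characters.

    Reported: control characters (category Cc), format characters such as
    zero-width spaces and BOMs (Cf), and space separators other than the
    ordinary space, for example the no-break space (Zs).
    """
    hits = []
    # Split on "\n" only: str.splitlines() would also split on vertical tab,
    # form feed and similar characters, hiding exactly what we want to find.
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in _ALLOWED:
                continue
            if unicodedata.category(ch) in ("Cc", "Cf", "Zs"):
                # Control characters have no Unicode name, hence the default.
                name = unicodedata.name(ch, "<unnamed>")
                hits.append((line_no, col, "U+%04X" % ord(ch), name))
    return hits
```

Running it over the decoded upload and printing the result gives a line and column for each hidden character, which is roughly the information the "show control characters" view in an editor would give visually.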
-- PHP Windows Mailing List (http://www.php.net/) To unsubscribe, visit: http://www.php.net/unsub.php