Message Boards » Tech Talk » Dealing with non standard characters
Raige
All American
4386 Posts

I have an app that people are constantly cutting and pasting into from Word. How do you guys deal with non-standard characters such as Microsoft Word's curly double quotes? Is there a specific way you capture those and convert them to normal quotes?

This would also apply to any other non-standard characters people cut and paste most often. Thanks!

4/11/2007 8:28:46 AM

synchrony7
All American
4462 Posts

I'm not sure what you are asking exactly. Do you mean programmatically, and if so, in what language? Are these Unicode characters or just 8-bit characters outside the normal ASCII 0-127 range?

4/11/2007 9:35:04 AM

Raige
All American
4386 Posts

They are out of the normal range. You know the typical squares you can get from cutting and pasting text from MS Word into an HTML page? Those are the characters I'm talking about.

I've solved my problem temporarily by using FCKEditor in all text fields, but I was wondering if there was a simpler solution.

[Edited on April 11, 2007 at 9:41 AM. Reason : ! damn i can't spell]

4/11/2007 9:41:40 AM

synchrony7
All American
4462 Posts

You can enable HTML forms to display Unicode characters (Arabic and Asian characters from other types of keyboards, for example), and these special characters are just from a different character set. This explains it: http://www.cs.tut.fi/~jkorpela/www/windows-chars.html

4/11/2007 9:51:32 AM

Raige
All American
4386 Posts

The concept is that I want to REMOVE non-standard characters (e.g. MS Word formatted quotes, etc.) and replace them with standard versions.
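
For illustration, here is a rough sketch of that kind of replacement in Python. The thread never names a language, so Python is an assumption, and `WORD_CHARS` and `normalize_word_chars` are made-up names. The table only covers the characters Word's autoformat most commonly introduces, and it assumes the input has already been decoded correctly into Unicode text.

```python
# Hypothetical lookup table: common Word autoformat characters -> plain ASCII.
# The set of characters chosen here is an assumption, not an exhaustive list.
WORD_CHARS = {
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark / apostrophe
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2013": "-",    # en dash
    "\u2014": "--",   # em dash
    "\u2026": "...",  # horizontal ellipsis
    "\u00a0": " ",    # no-break space
}

# str.maketrans accepts a dict of single characters mapped to replacement strings.
_TABLE = str.maketrans(WORD_CHARS)


def normalize_word_chars(text: str) -> str:
    """Replace the characters listed in WORD_CHARS with ASCII stand-ins."""
    return text.translate(_TABLE)
```

Anything not in the table passes through untouched, so accented letters and other legitimate non-ASCII text are left alone.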

4/11/2007 11:34:33 AM

Raige
All American
4386 Posts

BTW, very useful link.

4/11/2007 11:36:49 AM

philihp
All American
8349 Posts

You will have to define what the "normal" range is, and what "non-standard" characters are... and saying "non-standard characters are anything out of the normal range" doesn't count.
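
As one possible way to pin that down, the sketch below treats "normal" as plain ASCII (code points 0-127) and lists everything else with its Unicode name, so you can see what users are actually pasting before deciding what to replace. The ASCII cutoff and the name `find_non_ascii` are assumptions for illustration only.

```python
import unicodedata


def find_non_ascii(text):
    """Return (index, character, Unicode name) for each character above code point 127."""
    return [
        # The second argument to unicodedata.name is a fallback for unnamed characters.
        (i, ch, unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if ord(ch) > 127
    ]
```

Running this over a sample of real pasted input gives a concrete list to build a replacement policy from.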

4/11/2007 11:59:02 AM

scud
All American
10804 Posts

You are about to open a Pandora's box that you probably aren't ready for. There is no quick answer to your question. The only real answer is that you have to understand the intricacies of the codepages involved.

Here are some okay starting points:
http://en.wikipedia.org/wiki/Codepage
http://en.wikipedia.org/wiki/Windows-1252
http://en.wikipedia.org/wiki/ISO/IEC_8859-1


I'm going to guess that you're pasting into a Java application and running into problems converting Windows-1252 into UCS-2.
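
To make the codepage issue concrete, here is a small sketch (in Python rather than Java, purely for brevity). It assumes the pasted text arrives as Windows-1252 bytes, where 0x93 and 0x94 are the curly double quotes. Decoding those same bytes as ISO 8859-1 gives invisible C1 control characters instead, which is one common source of the "squares".

```python
# Bytes as Windows-1252 would encode curly double quotes around a word.
raw = b"\x93quoted\x94"

# Correct decode: 0x93 and 0x94 become U+201C and U+201D.
as_cp1252 = raw.decode("cp1252")

# Wrong decode: ISO 8859-1 maps the same bytes to U+0093 and U+0094,
# which are control characters with no visible glyph.
as_latin1 = raw.decode("latin-1")

# If text was mis-decoded as ISO 8859-1, round-tripping it can recover it.
repaired = as_latin1.encode("latin-1").decode("cp1252")
```

The round trip works here because ISO 8859-1 maps every byte to a code point, so no information is lost in the wrong decode.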

4/11/2007 11:05:31 PM

Raige
All American
4386 Posts

^ That's what I thought. The only method I think is 99% surefire is using an FCKEditor text box for every single manually input field. I think this is overkill, but the people using the tool want to cut and paste everything.

I appreciate the insight.

4/12/2007 9:43:24 AM

mysteryegg
Veteran
163 Posts

Bill Gates is your problem!
No, but seriously, Word's auto-formatting is what's introducing the characters you don't want. It's unfortunate... if you can't just stop people from inputting into Word... I'd be interested in seeing your solution.

4/15/2007 6:07:42 PM

© 2024 by The Wolf Web - All Rights Reserved.
The material located at this site is not endorsed, sponsored or provided by or on behalf of North Carolina State University.