
This isn't "odd" behavior. The change in byte length is a consequence of using a variable-width encoding, where the upper- and lowercase forms of a character can take different numbers of bytes. Also, when dealing with case mapping, you can't assume that the character count will remain constant. Unicode's full case mappings can map a single character to multiple characters, so you might end up with more characters than you started with, regardless of the encoding used.
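To make both effects concrete, here is a small Python sketch. It assumes Python 3, whose `str.upper()` applies Unicode full case mappings; the helper name `describe` and the sample characters are just illustrative choices, not anything from the linked discussion.

```python
def describe(s):
    """Return (code point count, UTF-8 byte count) for s."""
    return len(s), len(s.encode("utf-8"))

# Illustrative samples, written as escapes to avoid editor encoding issues.
samples = [
    "\u00df",  # LATIN SMALL LETTER SHARP S: uppercases to two characters, "SS"
    "\ufb01",  # LATIN SMALL LIGATURE FI: uppercases to two characters, "FI"
    "\u0131",  # LATIN SMALL LETTER DOTLESS I: uppercases to ASCII "I"
]

for s in samples:
    upper = s.upper()
    # Compare character and byte counts before and after case mapping.
    print(repr(s), describe(s), "->", repr(upper), describe(upper))
```

The first two samples gain a character when uppercased, which happens in any encoding. The third keeps its character count but shrinks from two UTF-8 bytes to one, which is the variable-width encoding effect.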


That's exactly right. My comment here is related:

https://news.ycombinator.com/item?id=42018937



