I'm trying to pass a string argument that contains Korean characters. This causes an error, apparently because the Korean characters are not properly encoded/decoded before the string is passed to the built-in open() function.
I wrote a command and executed it with os.system(), which is equivalent to running it at the command prompt.
import os

command = 'hwp5txt "C:\\Users\\username\\VSCodeProjects\\myproject\\data_files\\some_folder\\hwp\\2020-01-17_-_한국어가포함된 파일명(2020년도 제1차).hwp"> testdoc.txt'
os.system(command)
This throws an error because Korean characters are not properly decoded.
Traceback (most recent call last):
  File "C:\Users\username\AppData\Local\pypoetry\Cache\virtualenvs\asiae-bok-nlp-xpMr0EW7-py3.7\Scripts\hwp5txt-script.py", line 11, in <module>
    load_entry_point('pyhwp==0.1b12', 'console_scripts', 'hwp5txt')()
  File "c:\users\username\appdata\local\pypoetry\cache\virtualenvs\asiae-bok-nlp-xpmr0ew7-py3.7\lib\site-packages\hwp5\hwp5txt.py", line 102, in main
    with closing(Hwp5File(hwp5path)) as hwp5file:
  File "c:\users\username\appdata\local\pypoetry\cache\virtualenvs\asiae-bok-nlp-xpmr0ew7-py3.7\lib\site-packages\hwp5\filestructure.py", line 537, in __init__
    stg = Hwp5FileBase(stg)
  File "c:\users\username\appdata\local\pypoetry\cache\virtualenvs\asiae-bok-nlp-xpmr0ew7-py3.7\lib\site-packages\hwp5\filestructure.py", line 188, in __init__
    stg = OleStorage(stg)
  File "c:\users\username\appdata\local\pypoetry\cache\virtualenvs\asiae-bok-nlp-xpmr0ew7-py3.7\lib\site-packages\hwp5\storage\ole.py", line 35, in __init__
    self.impl = impl_class(*args, **kwargs)
  File "c:\users\username\appdata\local\pypoetry\cache\virtualenvs\asiae-bok-nlp-xpmr0ew7-py3.7\lib\site-packages\hwp5\plat\olefileio.py", line 112, in __init__
    if not isOleFile(olefile):
  File "c:\users\username\appdata\local\pypoetry\cache\virtualenvs\asiae-bok-nlp-xpmr0ew7-py3.7\lib\site-packages\olefile\olefile.py", line 309, in isOleFile
    with open(filename, 'rb') as fp:
OSError: [Errno 22] Invalid argument: 'C:\Users\username\VSCodeProjects\asiae-BOK-nlp\data_files\BOK_minutes\hwp\2020-01-17_-_??????? ???(2020?? ?1?).hwp'
As you can see, an OSError was raised because the command I sent to the prompt somehow failed to pass the Korean characters correctly: they arrive as ????? instead of the file's proper name.
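The "?" pattern in the error message can be reproduced in isolation. The sketch below is only an illustration, not a confirmed diagnosis: it assumes the name was forced through a code page that has no Hangul (cp1252 is picked here because it is the sys.stdin.encoding reported in this question), and the helper name lossy_roundtrip is made up for the example. Every character the code page cannot represent becomes "?", and "?" is not allowed in Windows file names, which would be consistent with open() failing with Errno 22.

```python
# Sketch: reproduce the "?" pattern by forcing the file name through a code page
# that has no Hangul. cp1252 is an ASSUMPTION, not a confirmed cause.
filename = "2020-01-17_-_한국어가포함된 파일명(2020년도 제1차).hwp"


def lossy_roundtrip(text: str, codepage: str = "cp1252") -> str:
    """Encode with '?' substituted for unencodable characters, then decode back."""
    return text.encode(codepage, errors="replace").decode(codepage)


mangled = lossy_roundtrip(filename)
print(mangled)  # the output is pure ASCII, so it is safe to print on any console
```

The result has the same shape as the file name shown in the OSError message: ASCII characters survive and each Korean character is replaced by a single "?".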
I also tried running the command manually in the terminal, but it fails as well.
How can I pass a string containing Korean characters so that it reaches the module intact?
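One variant that may be worth testing is to skip the shell command string and hand the arguments to the child process as a list with subprocess.run. The sketch below is an experiment under assumptions, not a confirmed fix for hwp5txt: it uses a child Python process as a stand-in, and the helper name echo_argument_via_child is hypothetical. It only checks whether a Korean argument arrives in a child process unchanged.

```python
import subprocess
import sys


def echo_argument_via_child(arg: str) -> str:
    """Pass `arg` to a child Python process as a list element and return
    the argument exactly as the child received it."""
    # The child writes its first argument back as UTF-8 bytes, so the
    # console encoding is not involved in the comparison.
    child_code = "import sys; sys.stdout.buffer.write(sys.argv[1].encode('utf-8'))"
    result = subprocess.run(
        [sys.executable, "-c", child_code, arg],
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout.decode("utf-8")


sample = "한국어가포함된 파일명(2020년도 제1차).hwp"
received = echo_argument_via_child(sample)
print(received == sample)  # prints a boolean only, to avoid console encoding issues

# The same pattern applied to the command from the question (untested here):
# with open("testdoc.txt", "wb") as out:
#     subprocess.run(["hwp5txt", hwp_path], stdout=out, check=True)
```

If the comparison holds but the hwp5txt call still fails, the problem is more likely in how the tool or its launcher script receives the argument than in how the parent process sends it.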
I'm using the latest version of VSCode with Git Bash terminal.
Here is the encoding information from my environment. If you need anything else, please comment.
sys.stdout.encoding
>> 'UTF-8'
sys.stdin.encoding
>> 'cp1252'
sys.getfilesystemencoding()
>> 'UTF-8'
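For anyone who wants to compare environments, the values above can be collected in one go. This is a small sketch; the function name encoding_report is made up for illustration, and locale.getpreferredencoding is added as one more data point that was not part of the original output.

```python
import locale
import sys


def encoding_report() -> dict:
    """Collect the encodings Python uses for the console, file names and locale."""
    return {
        # getattr guards against stdin/stdout being None (e.g. under pythonw)
        "stdin": getattr(sys.stdin, "encoding", None),
        "stdout": getattr(sys.stdout, "encoding", None),
        "filesystem": sys.getfilesystemencoding(),
        "preferred": locale.getpreferredencoding(False),
    }


for name, value in encoding_report().items():
    print(f"{name}: {value}")
```

Running it both from the VSCode Git Bash terminal and from a plain command prompt should show whether the two environments disagree.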