
Error with using infer_id() #10

Open
anudeike opened this issue Sep 25, 2020 · 4 comments
Assignees
Labels
bug Something isn't working

Comments

@anudeike

Hi! I'm using this code for a research project, thank you for providing it.

I am trying to make an inference using infer_id, and I just replicated the example in the FAQ. Here's what my code looks like:

import os
import pprint

from dotenv import load_dotenv
from m3inference import M3Twitter

load_dotenv()

# authentication
twitter_app_auth = {
    'consumer_key': os.getenv('TWITTER_API_KEY'),
    'consumer_secret': os.getenv('TWITTER_API_SECRET'),
    'access_token': os.getenv('TWITTER_ACCESS_TOKEN'),
    'access_token_secret': os.getenv('TWITTER_ACCESS_SECRET'),
}

inferenceTwitter = M3Twitter()

# init the api
inferenceTwitter.twitter_init(
    api_key=twitter_app_auth['consumer_key'],
    api_secret=twitter_app_auth['consumer_secret'],
    access_token=twitter_app_auth['access_token'],
    access_secret=twitter_app_auth['access_token_secret'],
)

pprint.pprint(inferenceTwitter.infer_id("2631881902"))

The traceback that I received was pretty confusing

RuntimeError:
    An attempt has been made to start a new process before the
    current process has finished its bootstrapping phase.

    This probably means that you are not using fork to start your
    child processes and you have forgotten to use the proper idiom
    in the main module:

        if __name__ == '__main__':
            freeze_support()
            ...

    The "freeze_support()" line can be omitted if the program
    is not going to be frozen to produce an executable.

RuntimeError: DataLoader worker (pid(s) 57016) exited unexpectedly

I'm not sure where the freeze_support() call is supposed to go, or how to deal with starting child processes without fork().

@zijwang
Member

zijwang commented Sep 25, 2020

I just tried your example and it seems the code works on my end.

>>> from m3inference import M3Twitter
>>> m3twitter = M3Twitter()
09/25/2020 12:09:58 - INFO - m3inference.m3inference -   Version 1.1.0
09/25/2020 12:09:58 - INFO - m3inference.m3inference -   Running on cpu.
09/25/2020 12:09:58 - INFO - m3inference.m3inference -   Will use full M3 model.
09/25/2020 12:09:59 - INFO - m3inference.m3inference -   Model full_model exists at <...>/m3/models/full_model.mdl.
09/25/2020 12:09:59 - INFO - m3inference.utils -   Checking MD5 for model full_model at <...>/m3/models/full_model.mdl
09/25/2020 12:09:59 - INFO - m3inference.utils -   MD5s match.
09/25/2020 12:10:01 - INFO - m3inference.m3inference -   Loaded pretrained weight at <...>/m3/models/full_model.mdl
09/25/2020 12:10:01 - INFO - m3inference.m3twitter -   Dir <...>/m3/cache does not exist. Creating now.
09/25/2020 12:10:01 - INFO - m3inference.m3twitter -   Dir <...>/m3/cache created.
>>> m3twitter.twitter_init_from_file('<...>/auth_example.txt')
True
>>> m3twitter.infer_id("2631881902")
09/25/2020 12:10:08 - INFO - m3inference.m3twitter -   Results not in cache. Fetching data from Twitter for id 2631881902.
09/25/2020 12:10:08 - INFO - m3inference.m3twitter -   GET /users/show.json?id=2631881902
09/25/2020 12:10:11 - INFO - m3inference.dataset -   1 data entries loaded.
Predicting...: 100%|██████████████████████████████████████████████| 1/1 [00:07<00:00,  7.60s/it]
{'input': {'description': 'Bundeskanzlerin', 'id': '2631881902', 'img_path': '<...>2631881902_224x224.jpg', 'lang': 'de', 'name': 'Angela Merkel', 'screen_name': 'AngelaMerkeICDU'}, 'output': {'gender': {'male': 0.0015, 'female': 0.9985}, 'age': {'<=18': 0.0, '19-29': 0.0, '30-39': 0.0001, '>=40': 0.9999}, 'org': {'non-org': 0.996, 'is-org': 0.004}}}

Could you provide the full log and traceback? If you are using a GPU could you also try to run the code on CPU only?

@computermacgyver
Member

Thanks, @anudeike , for reporting this. Could you let us know what operating system and version of Python you are using?

My guess, @zijwang, is that this may be a Windows-specific bug. The error mentions fork (Linux/macOS only) and freeze_support, which I believe is specific to making multiprocessing work on Windows. I've run nothing but Linux for so long that I'm unsure.
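For what it's worth, Python exposes the platform's default start method, so a quick check like the following (a minimal sketch, not from this thread) shows whether a machine uses spawn or fork:

```python
import multiprocessing as mp

# "fork" on Linux, "spawn" on Windows and (since Python 3.8) macOS.
# PyTorch DataLoader workers inherit this start method, which is why
# the freeze_support()/__main__ idiom matters on Windows.
print(mp.get_start_method())
```

On a spawn platform, child processes re-import the main module, so code outside an if __name__ == "__main__": guard runs again in every worker.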

@anudeike
Author

@computermacgyver Hi and you're welcome.

I am using Windows 10 and Python 3.8.

@computermacgyver computermacgyver self-assigned this Oct 27, 2020
@computermacgyver computermacgyver added the bug Something isn't working label Oct 27, 2020
@computermacgyver
Member

Hi @anudeike ,
I've done a bit of digging on this and think this is down to how you call the library in your script. Can you try this example:
https://github.com/euagendas/m3inference/blob/win/scripts/m3twitter.py

If you follow the instructions in the README to create a file called auth.txt based on the structure of auth_example.txt in that same directory, then you should be able to run

python m3twitter.py --auth auth.txt --screen-name computermacgyve --skip-cache
10/27/2020 19:22:55 - INFO - m3inference.m3inference -   Version 1.1.1
10/27/2020 19:22:55 - INFO - m3inference.m3inference -   Running on cpu.
10/27/2020 19:22:55 - INFO - m3inference.m3inference -   Will use full M3 model.
10/27/2020 19:22:56 - INFO - m3inference.m3inference -   Model full_model exists at /home/shale/m3/models/full_model.mdl.
10/27/2020 19:22:56 - INFO - m3inference.utils -   Checking MD5 for model full_model at /home/shale/m3/models/full_model.mdl
10/27/2020 19:22:56 - INFO - m3inference.utils -   MD5s match.
10/27/2020 19:22:56 - INFO - m3inference.m3inference -   Loaded pretrained weight at /home/shale/m3/models/full_model.mdl
10/27/2020 19:22:56 - INFO - m3inference.m3twitter -   skip_cache is True. Fetching data from Twitter for computermacgyve.
10/27/2020 19:22:56 - INFO - m3inference.m3twitter -   GET /users/show.json?screen_name=computermacgyve
10/27/2020 19:23:02 - INFO - m3inference.dataset -   1 data entries loaded.
Predicting...: 100%|██████████████████████████████| 1/1 [00:00<00:00,  2.30it/s]
{'input': {'description': 'Sr Research Fellow @oiioxford, Director of Research '
                          '@meedan, Fellow @turinginst.・widening access to '
                          'quality '
                          'info・multilingualism・mobilization・NLP・agenda '
                          'setting',
           'id': '19854920',
           'img_path': '/home/shale/m3/cache/19854920_224x224.jpg',
           'lang': 'en',
           'name': 'Scott Hale',
           'screen_name': 'computermacgyve'},
 'output': {'age': {'19-29': 0.0117,
                    '30-39': 0.1219,
                    '<=18': 0.0014,
                    '>=40': 0.865},
            'gender': {'female': 0.0003, 'male': 0.9997},
            'org': {'is-org': 0.0002, 'non-org': 0.9998}}}

What you'll see in that file is that

1. I make sure the contents of my main program are within an if __name__ == "__main__": block.
2. The first statement of that block is freeze_support().
3. I have imported that freeze_support function from multiprocessing, i.e.,

from multiprocessing import freeze_support
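Putting those three steps together, a minimal sketch of the Windows-safe idiom looks like this (using a plain multiprocessing worker as a stand-in for the m3inference calls):

```python
from multiprocessing import Process, Queue, freeze_support

def worker(q):
    # Stand-in for the work a child process does (e.g. a DataLoader worker).
    q.put("done")

def main():
    q = Queue()
    p = Process(target=worker, args=(q,))
    p.start()           # on Windows this spawns a fresh interpreter
    result = q.get()
    p.join()
    return result

if __name__ == "__main__":
    freeze_support()    # first statement of the guarded block (step 2)
    print(main())
```

Because spawned children re-import the main module, the __name__ guard is what prevents the "bootstrapping phase" RuntimeError shown above.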
