MemoryError #2

Open
vinnitu opened this issue Dec 22, 2016 · 7 comments

@vinnitu

vinnitu commented Dec 22, 2016

I have 64 GB of RAM plus 64 GB of swap, but vectorization still fails with a MemoryError.

Is this expected?

47590536
answer:   'To listen to the audio turn off the.....'
question: 'To listen to the audio turn off the.....'

47590536
answer:   'leader, who sent out fund-raising.......'
question: 'leader, who sent out fund-raising.......'

Vectorization...
X = np_zeros
Traceback (most recent call last):
  File "keras_spell.py", line 302, in <module>
    main_news()
  File "keras_spell.py", line 296, in main_news
    X_train, X_val, y_train, y_val, y_maxlen, ctable = vectorize(questions, answers, chars)
  File "keras_spell.py", line 97, in vectorize
    X = np_zeros((len_of_questions, x_maxlen, len(chars)), dtype=np.bool)
MemoryError
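
A rough back-of-the-envelope check of why this allocation fails, as a minimal sketch. It assumes the number printed in the log above (47590536) is the number of questions; the sequence length and alphabet size are hypothetical guesses, since the thread does not state `x_maxlen` or `len(chars)`.

```python
def dense_bool_bytes(n_samples, maxlen, n_chars):
    """Bytes needed by a dense boolean array of shape (n_samples, maxlen, n_chars)."""
    # NumPy stores each boolean element in one byte, not one bit.
    return n_samples * maxlen * n_chars


# ASSUMPTION: sample count taken from the number printed in the log.
n_samples = 47_590_536
# ASSUMPTION: hypothetical values, not stated anywhere in the thread.
maxlen = 40
n_chars = 100

gib = dense_bool_bytes(n_samples, maxlen, n_chars) / 2**30
print(f"X alone would need about {gib:.0f} GiB")
```

Under these assumptions the `X` array alone would exceed 64 GB of RAM plus 64 GB of swap, before the target array is even allocated.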
@MajorTal
Owner

Nope, not OK. In a later revision of the code, which I have not released yet, we moved to batches (a new feature in Keras), which lets us train without holding the entire dataset in memory.
For now, limit the number of lines read in the read_news function. Start with 10K?
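
Both suggestions can be sketched roughly as follows. This is not the unreleased revision mentioned above, only an illustration: `take_lines` and `one_hot_batches` are hypothetical helper names, and the sketch assumes every character in the text appears in `chars`.

```python
from itertools import islice

import numpy as np


def take_lines(lines, max_lines=10_000):
    """Return at most max_lines lines from an iterable (e.g. an open file)."""
    return [line.rstrip("\n") for line in islice(lines, max_lines)]


def one_hot_batches(questions, answers, chars, maxlen, batch_size=128):
    """Yield (X, y) one-hot batches forever, building each batch on demand."""
    char_to_index = {c: i for i, c in enumerate(chars)}
    n = len(questions)
    while True:  # training loops that consume generators expect them to never end
        for start in range(0, n, batch_size):
            q_batch = questions[start:start + batch_size]
            a_batch = answers[start:start + batch_size]
            # Only one batch worth of one-hot data is alive at any time.
            X = np.zeros((len(q_batch), maxlen, len(chars)), dtype=bool)
            y = np.zeros((len(a_batch), maxlen, len(chars)), dtype=bool)
            for i, (q, a) in enumerate(zip(q_batch, a_batch)):
                for t, c in enumerate(q[:maxlen]):
                    X[i, t, char_to_index[c]] = True
                for t, c in enumerate(a[:maxlen]):
                    y[i, t, char_to_index[c]] = True
            # Positions past the end of a short string stay all-False here.
            yield X, y
```

A generator like this could be handed to Keras's generator-based training entry point (`fit_generator` in the Keras releases of that period). Peak memory then scales with `batch_size` rather than with the size of the corpus.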

@vgoklani

One-hot feature encoding takes a lot of memory; perhaps try a sparse representation.
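
One possible reading of "sparse representation", as a hedged sketch rather than what was actually done here: store one small integer index per character and expand to one-hot only when needed. `encode_indices` and `to_one_hot` are hypothetical names, and the sketch assumes all texts are already padded to `maxlen` with a character from `chars`.

```python
import numpy as np


def encode_indices(texts, chars, maxlen):
    """Encode each text as a row of character indices (one byte per character)."""
    if len(chars) > 256:
        raise ValueError("uint8 can only index up to 256 distinct characters")
    char_to_index = {c: i for i, c in enumerate(chars)}
    out = np.zeros((len(texts), maxlen), dtype=np.uint8)
    for i, text in enumerate(texts):
        for t, c in enumerate(text[:maxlen]):
            out[i, t] = char_to_index[c]
    return out


def to_one_hot(indices, n_chars):
    """Expand an index array of shape (n, maxlen) to one-hot (n, maxlen, n_chars)."""
    # Indexing an identity matrix with an integer array picks one one-hot row per index.
    return np.eye(n_chars, dtype=bool)[indices]
```

The index array is `len(chars)` times smaller than the dense one-hot array, so it can stay in memory while `to_one_hot` is applied to one slice at a time.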

Thanks @MajorTal for releasing the code. I replicated your results, but used CHARS to avoid the memory explosion.

@MajorTal
Owner

CHARS? Please elaborate.

@vinnitu
Author

vinnitu commented Dec 26, 2016

About swap: it seems swap is not used at all.

@netsh4rk

netsh4rk commented Jan 2, 2017

Hi, thank you so much for sharing this!
I have a similar problem to @vinnitu's: Python exits with a "Killed" error, I guess because of memory too...
So I started training with 10K examples, and it began correctly.

Now I'm at the first iteration (OK, it's only the first one), but are results like these normal at this step, or is something wrong?

--------------------------------------------------
Iteration 1
Train on 33229 samples, validate on 3693 samples
Epoch 1/5
33229/33229 [==============================] - 4233s - loss: 2.9667 - acc: 0.2423 - val_loss: 2.6717 - val_acc: 0.2999
Epoch 2/5
33229/33229 [==============================] - 4299s - loss: 2.6127 - acc: 0.3124 - val_loss: 2.5249 - val_acc: 0.3222
Epoch 3/5
33229/33229 [==============================] - 4335s - loss: 2.5537 - acc: 0.3208 - val_loss: 2.4999 - val_acc: 0.3258
Epoch 4/5
33229/33229 [==============================] - 4225s - loss: 2.5241 - acc: 0.3248 - val_loss: 2.5127 - val_acc: 0.3242
Epoch 5/5
33229/33229 [==============================] - 4073s - loss: 2.5072 - acc: 0.3270 - val_loss: 2.4975 - val_acc: 0.3263
Q English icty of Coventry in 2008, was...
A English city of Coventry in 2008, was...
X toe                              ee.....
---
Q He hoped leftwingers would respect his..
A He hoped leftwingers would respect his..
X The                               e.....
---
Q "The unexpected rebound will help to....
A "The unexpected rebound will help to....
X toe                              e......
---
Q straight games..........................
A straight games..........................
X tee       eee...........................
---
Q 1,234 U.S. residents age 21 and older,..
A 1,234 U.S. residents age 21 and older,..
X The                               e.....
---
Q W edon't even know what's in the next...
A We don't even know what's in the next...
X toe                              ee.....
---
Q to quit.................................
A to quit.................................
X teeee...................................
---
Q to give these answers as the ploice.....
A to give these answers as the police.....
X toe                             e.......
---
Q Uzoh and Quinton Ross, swingmyan........
A Uzoh and Quinton Ross, swingman.........
X toe                          e..........
---
Q her bank accont had been a repository...
A her bank account had been a repository..
X toe                              ee.....
---

The only patches I made to the code were to work around some UTF-8 encoding errors:

[screenshot: schermata 2017-01-02 alle 18 31 03]

but I don't think they broke anything...

I'm trying this on a "poor" Mac i7. Later on I'll move to an AWS p2.xlarge, but first I would like to check that everything is working properly.
Thank you!
Best

Luca

@MajorTal
Owner

MajorTal commented Jan 3, 2017

It actually looks ok for a first iteration. May the network be ever converging in your favor.

@netsh4rk

netsh4rk commented Jan 5, 2017

OK, thank you @MajorTal!
I would like to train on an Italian-language corpus.
Do you think the Google 5-gram Italian corpus would be OK for training?
It's not so easy to find a news corpus with millions of lines in Italian...
Thank you!
