I downloaded the code and tried to reproduce your score on the VQA 2.0 set. Since my machine cannot hold the whole training set, I split `vqa_train_final.json` and `coco_features.npy` into 7 folds grouped by image id (e.g. `vqa_train_final.0.json` contains image ids [1, 2, 3], `coco_features.0.npy` contains the features for images [1, 2, 3], and no other fold contains any data for those images). I trained the model in two ways: one loads the folds from 0 to 6 within each epoch and repeats this 50 times; the other trains for 50 epochs on one fold before moving on to the next.
However, both approaches result in a low accuracy, around 30%. The tokenized questions and the COCO 36-box features were downloaded from the links you described. What do you think might be the cause? Thanks.
This is how I split the data:
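To make the two training schedules concrete, here is a minimal sketch (the function names are hypothetical, just to illustrate the visit order of the folds):

```python
# Sketch of the two schedules described above; n_folds=7 and n_epochs=50
# match the setup in the question.

def interleaved_schedule(n_folds, n_epochs):
    """Strategy 1: cycle through folds 0..n_folds-1 inside every epoch."""
    return [(epoch, fold) for epoch in range(n_epochs) for fold in range(n_folds)]

def sequential_schedule(n_folds, n_epochs):
    """Strategy 2: train n_epochs on one fold, then move to the next."""
    return [(epoch, fold) for fold in range(n_folds) for epoch in range(n_epochs)]
```

Both schedules perform the same total number of fold visits, but the sequential one finishes all updates on fold 0 before ever seeing fold 6, so the model can drift away from the earlier folds; the interleaved order is much closer to training on the full shuffled dataset.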
```python
import base64
import csv
import json
import os

import numpy as np

def split_images():
    # Enumerate the COCO train2014 image ids and split them into 7 chunks.
    list_train = os.listdir('G:/train2014/')
    list_train.remove('COCO_train2014_000000372405.jpg')  # skip this image
    ids = [int(f[15:27]) for f in list_train]
    length = int(len(ids) / 7) + 1
    ids_list = [ids[i:i + length] for i in range(0, len(ids), length)]
    for i in range(len(ids_list)):
        np.savetxt('split/imageIds.train.' + str(i), ids_list[i], fmt='%d')

def split_json():
    # Split vqa_train_final.json into 7 files, grouped by image id.
    train = json.load(open('vqa_train_final.json'))
    for i in range(7):
        ids = np.loadtxt('split/imageIds.train.' + str(i)).astype(int)
        s = set(ids)
        data = [entry for entry in train if entry['image_id'] in s]
        json.dump(data, open('split/vqa_train_final.json.' + str(i), 'w'))

def split_features():
    # Split the bottom-up-attention feature TSV into 7 .npy files.
    # `infile` and `FIELDNAMES` are defined in the feature-reading script
    # (not shown here).
    for k in range(7):
        ids = np.loadtxt('split/imageIds.train.' + str(k)).astype(int)
        s = set(ids)
        in_data = {}
        with open(infile, 'rt') as tsv_in_file:
            reader = csv.DictReader(tsv_in_file, delimiter='\t', fieldnames=FIELDNAMES)
            for i, item in enumerate(reader, 1):
                if i % 1000 == 0:
                    print(k, i)
                try:
                    image_id = int(item['image_id'])
                    if image_id in s:
                        # base64.decodestring is deprecated; decodebytes is
                        # the current name for the same function.
                        b = base64.decodebytes(bytes(item['features'], encoding='utf8'))
                        in_data[image_id] = np.frombuffer(b, dtype=np.float32).reshape((36, -1))
                except (ValueError, KeyError):
                    print('error', item['image_id'])
        np.save('split/coco_features.npy.train.' + str(k), in_data)
```
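One thing worth verifying is that the split really is a partition: every image id lands in exactly one fold and none are lost. A small sketch of such a check (the `check_partition` helper is hypothetical; feed it the id arrays read back with `np.loadtxt` and the full id list):

```python
def check_partition(folds, all_ids):
    # `folds` is a list of id collections, `all_ids` the full id list.
    seen = set()
    for fold in folds:
        fold_set = set(fold)
        if seen & fold_set:
            raise ValueError('folds overlap: %r' % (seen & fold_set))
        seen |= fold_set
    if seen != set(all_ids):
        raise ValueError('ids missing from folds: %r' % (set(all_ids) - seen))
    return True
```

If this passes for the 7 id files, the low accuracy is more likely caused by the training schedule or by a question/feature mismatch than by the split itself.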