noko's Blog

This is just a temporary place for my notes, so please bear with me.

Roughly tallying which categories show up most in ABC Problem E, using AtCoder Tags

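Introduction

  • I'd like to be able to solve Problem E someday.
  • -> So which categories of problems actually come up often in Problem E?
  • -> There's a great site called AtCoder Tags, so I did some rough scraping and tallied the results.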

Results (new-format ABC only, contests 126-163; each problem is counted under its single top tag)

Problem A

[('Easy', 35), ('String', 2), ('Searching', 1)]

[Figure: pie chart of the Problem A tally]

Problem B

[('Easy', 23), ('String', 5), ('Searching', 3), ('Mathematics', 3), ('Ad-Hoc', 2), ('Construct', 1), ('Greedy-Methods', 1)]

[Figure: pie chart of the Problem B tally]

Problem C

[('Mathematics', 9), ('Searching', 9), ('Ad-Hoc', 5), ('Easy', 3), ('Construct', 3), ('Greedy-Methods', 3), ('Technique', 2), ('Data-Structure', 2), ('Dynamic-Programming', 1), ('String', 1)]

[Figure: pie chart of the Problem C tally]

Problem D

[('Mathematics', 10), ('Searching', 7), ('Ad-Hoc', 5), ('Graph', 4), ('Data-Structure', 4), ('Dynamic-Programming', 3), ('Technique', 3), ('Construct', 1), ('Greedy-Methods', 1)]

[Figure: pie chart of the Problem D tally]

Problem E

[('Dynamic-Programming', 11), ('Mathematics', 8), ('Searching', 5), ('Data-Structure', 4), ('Graph', 3), ('Construct', 2), ('Greedy-Methods', 2), ('String', 1), ('Technique', 1), ('Ad-Hoc', 1)]

[Figure: pie chart of the Problem E tally]
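To put the Problem E tally in proportion, here is a small sketch that turns the counts into percentages. The counts are copied from the result above; the helper name `to_shares` is my own for illustration and is not part of the original script.

```python
# Tally for Problem E, copied from the result above.
e_counts = [('Dynamic-Programming', 11), ('Mathematics', 8), ('Searching', 5),
            ('Data-Structure', 4), ('Graph', 3), ('Construct', 2),
            ('Greedy-Methods', 2), ('String', 1), ('Technique', 1), ('Ad-Hoc', 1)]


def to_shares(counts):
    """Convert (tag, count) pairs into (tag, percentage) pairs, rounded to 0.1."""
    total = sum(n for _, n in counts)
    return [(tag, round(100 * n / total, 1)) for tag, n in counts]


e_total = sum(n for _, n in e_counts)  # one top tag per contest, 126 through 163
e_shares = to_shares(e_counts)
for tag, pct in e_shares:
    print(f'{tag}: {pct}%')
```

The total comes to 38, which matches the number of contests from 126 through 163, and Dynamic-Programming accounts for roughly 29% of Problem E.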

Problem F

[('Dynamic-Programming', 12), ('Graph', 5), ('Mathematics', 5), ('Data-Structure', 3), ('Geometry', 3), ('Construct', 2), ('Searching', 2), ('String', 2), ('Technique', 2), ('Ad-Hoc', 2)]

[Figure: pie chart of the Problem F tally]
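Lining the six tallies up side by side makes it easier to see how one category shifts with difficulty. The sketch below looks up the Dynamic-Programming count for each problem. The data is copied from the results above; the names `results` and `count_for` are just ones I made up for this illustration.

```python
# Tallies copied from the results above, keyed by problem letter.
results = {
    'A': [('Easy', 35), ('String', 2), ('Searching', 1)],
    'B': [('Easy', 23), ('String', 5), ('Searching', 3), ('Mathematics', 3),
          ('Ad-Hoc', 2), ('Construct', 1), ('Greedy-Methods', 1)],
    'C': [('Mathematics', 9), ('Searching', 9), ('Ad-Hoc', 5), ('Easy', 3),
          ('Construct', 3), ('Greedy-Methods', 3), ('Technique', 2),
          ('Data-Structure', 2), ('Dynamic-Programming', 1), ('String', 1)],
    'D': [('Mathematics', 10), ('Searching', 7), ('Ad-Hoc', 5), ('Graph', 4),
          ('Data-Structure', 4), ('Dynamic-Programming', 3), ('Technique', 3),
          ('Construct', 1), ('Greedy-Methods', 1)],
    'E': [('Dynamic-Programming', 11), ('Mathematics', 8), ('Searching', 5),
          ('Data-Structure', 4), ('Graph', 3), ('Construct', 2),
          ('Greedy-Methods', 2), ('String', 1), ('Technique', 1), ('Ad-Hoc', 1)],
    'F': [('Dynamic-Programming', 12), ('Graph', 5), ('Mathematics', 5),
          ('Data-Structure', 3), ('Geometry', 3), ('Construct', 2),
          ('Searching', 2), ('String', 2), ('Technique', 2), ('Ad-Hoc', 2)],
}


def count_for(tag, tally):
    """Return how many contests had `tag` as the top tag (0 if it never was)."""
    return dict(tally).get(tag, 0)


# Sanity check: every problem letter should cover the same number of contests.
totals = {p: sum(n for _, n in tally) for p, tally in results.items()}

dp_by_problem = {p: count_for('Dynamic-Programming', tally)
                 for p, tally in results.items()}
print(totals)
print(dp_by_problem)
```

Dynamic-Programming never tops A or B, shows up once in C and three times in D, and then becomes the most common top tag in both E (11) and F (12).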

Script

# !pip install requests
# !pip install beautifulsoup4
import requests
from bs4 import BeautifulSoup
import re
import ast

start_contest = 126
end_contest = 163
problem_num = '_e'
max_tag_list = []
for i in range(start_contest, end_contest+1):
    # http get
    url = 'https://atcoder-tags.herokuapp.com/check/abc' + str(i) + problem_num
    print('url: ' + url)
    response = requests.get(url)

    # parse
    soup = BeautifulSoup(response.text, "html.parser")

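    # get the line that holds the tag data ("var dict = {...}")
    tag_dict_str_raw = ''
    lines = str(soup).splitlines()
    for line in lines:
        if line.find("var dict") >= 0:
            tag_dict_str_raw = line

    # pull the {...} literal out of that line
    r = re.compile('(%s.*%s)' % ('{','}'))
    tag_dict_str_re = r.search(tag_dict_str_raw)
    if tag_dict_str_re is None:
        # no tag data found on this page, so skip this contest
        continue
    tag_dict_str = tag_dict_str_re.group(0)

    # str->dict
    tag_dict = ast.literal_eval(tag_dict_str)
    if not tag_dict:
        # empty dict: nothing to pick a maximum from
        continue

    # get max key (the tag with the largest value)
    max_tag = max(tag_dict, key=tag_dict.get)
    max_tag_list.append(max_tag)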

# aggregate results
import collections
counter = collections.Counter(max_tag_list)
count = counter.most_common()
print(count)

# visualization
# !pip install matplotlib
import matplotlib.pyplot as plt
%matplotlib inline

x = [x[1] for x in count][:10]
label = [label[0] for label in count][:10]

plt.figure(figsize=(5,5), dpi=100)
plt.pie(x, labels=label, startangle=90, counterclock=False, autopct='%.1f%%', pctdistance=0.75)
plt.axis('equal') 
plt.show()
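The extraction step in the script can be tried offline without hitting the site. Below is a minimal sketch that runs the same idea (find the `var dict` line, pull out the `{...}` literal, take the key with the largest value) against a made-up HTML snippet. The function name `extract_top_tag` and the contents of `sample_html` are assumptions for illustration only; the real page layout may differ.

```python
import ast
import re


def extract_top_tag(html):
    """Return the key with the largest value in the `var dict` literal, or None."""
    for line in html.splitlines():
        if 'var dict' in line:
            match = re.search(r'\{.*\}', line)
            if match is None:
                return None
            # literal_eval only accepts Python literals, so it is safer than eval
            tag_dict = ast.literal_eval(match.group(0))
            if not tag_dict:
                return None
            return max(tag_dict, key=tag_dict.get)
    return None


# Hypothetical page fragment; the tag names and numbers are invented.
sample_html = """<script>
var dict = {"Dynamic-Programming": 5, "Mathematics": 2, "Graph": 1};
</script>"""

top_tag = extract_top_tag(sample_html)
missing = extract_top_tag('<p>no tag data here</p>')
print(top_tag, missing)
```

One thing to keep in mind is that `max` returns the first key it encounters when several tags share the largest value, so ties are resolved by the order of the dict rather than by any deliberate rule.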

Conclusion

  • I'll work hard on DP.