Updated: September 8, 2014
Development log

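I built a Ruby command-line tool that bulk-downloads and saves your Twitpic images!

Twitpic is shutting down on September 25, so I checked my own account and was surprised to find more than 300 uploaded images. I used it quite a lot in the early days of Twitter.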

Twitpic is shutting down | Twitpic Blog

So I wrote a Ruby command-line tool that downloads all of a Twitpic account's images in one go. I have confirmed that it works on Ruby 2.1.2.

Tested with:
Ruby 2.1.2
Mac OS X Lion 10.7.5

Twitpic downloader with Ruby

It will probably run on Ruby 2.0.* and later. It may well run on 1.9.* too. I don't know about 1.8.*.
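If you are not sure which Ruby you have, a quick check like the following may help. It is only a minimal sketch: `ruby_at_least?` is a hypothetical helper that is not part of the tool, and the 2.0 threshold simply mirrors the guess above rather than anything that was tested.

```ruby
require 'rubygems' # provides Gem::Version (already loaded by default on Ruby 1.9+)

# Hypothetical helper, not part of twitpic_downloader.rb.
# Compares version strings segment by segment instead of as plain text.
def ruby_at_least?(minimum, current = RUBY_VERSION)
  Gem::Version.new(current) >= Gem::Version.new(minimum)
end

if ruby_at_least?('2.0')
  puts "Ruby #{RUBY_VERSION}: probably fine"
else
  puts "Ruby #{RUBY_VERSION}: untested, may not work"
end
```

Comparing with `Gem::Version` avoids the trap of comparing version strings as plain text, where "2.10" would sort before "2.9".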

How to use the download tool

Save the download tool from the GitHub Gist linked above into a directory of your choice, then run it as follows.


$ mkdir work_dir
$ ruby twitpic_downloader.rb twitpic_username work_dir

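Create a working directory beforehand (work_dir in this example). Then run the Ruby script, passing twitpic_username (your Twitpic user name) and work_dir (the name of the working directory) as arguments, and the image download starts.

All of your Twitpic images are saved to work_dir/images. Downloading the 316 images in my Twitpic account took about 8 minutes 30 seconds.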
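For a rough idea of how long a larger account might take, the figures above can be turned into a per-image rate. This is just back-of-the-envelope arithmetic: `estimated_minutes` is a hypothetical helper, and it assumes download time scales linearly with the number of images, which depends on your connection and on Twitpic's servers.

```ruby
# Figures from the run described above: 316 images in about 8 min 30 s.
MEASURED_IMAGES  = 316
MEASURED_SECONDS = 8 * 60 + 30

# Average wall-clock seconds spent per image in that run.
def seconds_per_image
  MEASURED_SECONDS.to_f / MEASURED_IMAGES
end

# Hypothetical helper: linear extrapolation to another image count.
def estimated_minutes(image_count)
  (image_count * seconds_per_image / 60.0).round(1)
end

puts format('%.2f seconds per image', seconds_per_image)
puts "1000 images: roughly #{estimated_minutes(1000)} minutes"
```

Each image costs at least two HTTP requests in this tool (the full-size HTML page, then the image itself), so the per-image figure includes both.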

The download tool's code

I based this Ruby script on a shell script published by "いいハコ作ろう". Thank you.

Batch-downloading Twitpic images on a Mac – いいハコ作ろう
 Twitpic whole images downloader for mac

Note that my version leaves out a few things, such as log file output.
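If you would like log file output back, Ruby's standard `Logger` is one way to add it. The following is a minimal sketch, not what the original shell script does: `build_logger` and the file name `twitpic_downloader.log` are assumptions made for illustration, and the demo writes to a temporary directory instead of a real work_dir.

```ruby
require 'logger'
require 'tmpdir'

# Hypothetical helper: returns a Logger that appends to a file inside work_dir.
def build_logger(work_dir)
  logger = Logger.new(File.join(work_dir, 'twitpic_downloader.log'))
  logger.level = Logger::INFO
  logger
end

# Demo in a throwaway directory so nothing is left behind.
Dir.mktmpdir do |dir|
  logger = build_logger(dir)
  logger.info('download html: page 1')
  logger.warn('skipped image with empty url')
  logger.close
  puts File.read(File.join(dir, 'twitpic_downloader.log'))
end
```

In the script itself, the `puts` calls could then be swapped for `logger.info`, so that progress ends up in the file as well as (or instead of) the terminal.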

twitpic_downloader.rb

# Twitpic downloader with Ruby
#
# This tool enables you to save all your twitpic full-size images.
# Confirmed this tool working with Ruby 2.1.2.
# 
# Usage
# $ mkdir work_dir
# $ ruby twitpic_downloader.rb user_name work_dir
#
# MIT License
# Copyright (c) 2014 Takafumi Yamano

require 'date'
require 'open-uri'

# prepare for saving images
USER_NAME = ARGV[0].to_s
WORK_DIR = ARGV[1].to_s
IMG_SAVE = 1
PREFIX = "twitpic-#{USER_NAME}"

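if USER_NAME.empty?
  puts "Error: You must supply your twitpic USER_NAME."
  exit 1 # non-zero status signals the failure to the shell
end

# Dir.exists? was removed in Ruby 3.2; Dir.exist? works on old and new versions
unless Dir.exist?(WORK_DIR)
  puts "Error: You must create the WORK_DIR beforehand."
  exit 1
end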

Dir.mkdir "#{WORK_DIR}/images" unless Dir.exist?("#{WORK_DIR}/images")
Dir.mkdir "#{WORK_DIR}/html" unless Dir.exist?("#{WORK_DIR}/html")

# download twitpic html pages
page = 1

while true
  puts "page: #{page}"
  input_url = "http://twitpic.com/photos/#{USER_NAME}?page=#{page}"
  output_file = "#{WORK_DIR}/html/#{PREFIX}-page-#{page}.html"

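  # skip pages that were already fetched on an earlier run
  unless File.exist?(output_file)
    puts "download html: #{input_url}"
    File.open(output_file, 'w') do |output|
      # Kernel#open no longer accepts URLs since Ruby 3.0; URI#open (open-uri) does
      URI.parse(input_url).open('r') do |html_data|
        output.write(html_data.read)
      end
    end
  end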

  break unless File.read(output_file) =~ /Next/
  page += 1
end

# extract all image ids from downloaded html pages
image_ids = []
Dir.glob("#{WORK_DIR}/html/#{PREFIX}-page-*").each do |file|
  image_ids.push File.read(file).scan(/<a href="\/([a-zA-Z0-9]+)">/).flatten
end

image_ids = image_ids.flatten.uniq.delete_if{|i| i == "sopapipa"}.sort

# download twitpic html pages of full size images
image_ids.each_with_index do |id, index|
  puts "#{index+1}: #{id}"

  full_url = "http://twitpic.com/#{id}/full"
  full_file = "#{WORK_DIR}/html/#{PREFIX}-#{id}-full.html"

  # skip full-size pages that were already fetched on an earlier run
  unless File.exist?(full_file)
    puts "download full url: #{full_url}"
    File.open(full_file, 'w') do |output|
      URI.parse(full_url).open('r') do |html_data|
        output.write(html_data.read)
      end
    end
  end
end

# extract all full image urls
full_image_urls = {}
image_ids.each do |id|
  file = "#{WORK_DIR}/html/#{PREFIX}-#{id}-full.html"
  full_image_urls[id] = File.read(file).scan(/<img src="([^"]*)"/).flatten.grep(/(https:\/\/[^"]*)/){|i| $1}[0]
end

# download full images
unless IMG_SAVE == 1
  puts "Warning: Didn't save full size images yet."
  puts "Warning: Change IMG_SAVE to 1 in order to save full images."
  exit
end

full_image_urls.each_with_index do |(id, url), index|
  puts "#{index+1}: #{id}"
  next if url.to_s.empty?
  extension = url.scan(/\.([a-zA-Z]+)\?[0-9]+\z/).flatten[0]
  full_image_file = "#{WORK_DIR}/images/#{PREFIX}-#{id}-full.#{extension}"

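  unless File.exist?(full_image_file)
    puts "save full image: #{url}"
    begin
      File.open(full_image_file, 'wb') do |output|
        URI.parse(url).open('rb') do |image_data|
          output.write(image_data.read)
        end
      end
    rescue => e
      # report the failure and remove the partial file so a re-run retries it
      puts "failed: #{url} (#{e.message})"
      File.delete(full_image_file) if File.exist?(full_image_file)
      next
    end
  end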
end
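
The two regular expressions that do the real work (pulling image IDs out of a list page, and pulling the file extension out of an image URL) can be tried on their own. The sketch below uses made-up sample data: the HTML snippet and the example.com URLs are assumptions for illustration, not captured from Twitpic, and `extract_image_ids` and `extension_of` are hypothetical helper names.

```ruby
# Hypothetical markup loosely resembling a list page; not real Twitpic HTML.
SAMPLE_LIST_HTML = <<-HTML
<a href="/abc123"><img src="thumb1.jpg"></a>
<a href="/photos/someone">profile</a>
<a href="/xyz789"><img src="thumb2.jpg"></a>
<a href="/abc123">duplicate link</a>
HTML

# Same pattern as the script: links whose path is a single alphanumeric segment.
def extract_image_ids(html)
  html.scan(/<a href="\/([a-zA-Z0-9]+)">/).flatten.uniq.sort
end

# Same pattern as the script: the extension must be followed by "?digits".
def extension_of(url)
  url[/\.([a-zA-Z]+)\?[0-9]+\z/, 1]
end

p extract_image_ids(SAMPLE_LIST_HTML)
p extension_of('https://example.com/full/12345.jpg?1410000000')
p extension_of('https://example.com/full/12345.jpg')
```

The last call returns nil because the pattern requires a numeric query string after the extension. In the script that would produce a file name ending in a bare dot, so a fallback extension may be worth adding if your image URLs look different.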

If you have lots of images in your Twitpic account and a Ruby environment at hand, please give it a try!
