1. Download the jsoup library
2. Sample code (args = http://www.google.co.kr)
package org.jsoup.examples;

import org.jsoup.Jsoup;
import org.jsoup.helper.Validate;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.io.IOException;

/**
 * Example program to list links from a URL.
 */
public class ListLinks {
    public static void main(String[] args) throws IOException {
        Validate.isTrue(args.length == 1, "usage: supply url to fetch");
        String url = args[0];
        print("Fetching %s...", url);

        // Fetch the page over HTTP and parse it into a DOM.
        Document doc = Jsoup.connect(url).get();

        // CSS-style selectors: anchors with href, anything with src, <link> tags with href.
        Elements links = doc.select("a[href]");
        Elements media = doc.select("[src]");
        Elements imports = doc.select("link[href]");

        print("\nMedia: (%d)", media.size());
        for (Element src : media) {
            // The "abs:" prefix resolves the attribute value against the document's base URI.
            if (src.tagName().equals("img"))
                print(" * %s: <%s> %sx%s (%s)",
                        src.tagName(), src.attr("abs:src"), src.attr("width"), src.attr("height"),
                        trim(src.attr("alt"), 20));
            else
                print(" * %s: <%s>", src.tagName(), src.attr("abs:src"));
        }

        print("\nImports: (%d)", imports.size());
        for (Element link : imports) {
            print(" * %s <%s> (%s)", link.tagName(), link.attr("abs:href"), link.attr("rel"));
        }

        print("\nLinks: (%d)", links.size());
        for (Element link : links) {
            print(" * a: <%s> (%s)", link.attr("abs:href"), trim(link.text(), 35));
        }
    }

    // printf-style helper that always ends the line.
    private static void print(String msg, Object... args) {
        System.out.println(String.format(msg, args));
    }

    // Caps s at width characters; a longer string is cut and ends with ".".
    private static String trim(String s, int width) {
        if (s.length() > width)
            return s.substring(0, width - 1) + ".";
        else
            return s;
    }
}
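The trim helper in the listing is what keeps long alt texts and link captions from blowing up the output. A minimal standalone sketch of the same logic, using only the JDK, might look like this. The class name TrimSketch and the sample strings are made up for illustration.

```java
public class TrimSketch {
    // Same logic as the trim helper in the listing above:
    // return s unchanged if it fits, otherwise keep width-1 characters and append ".".
    static String trim(String s, int width) {
        if (s.length() > width)
            return s.substring(0, width - 1) + ".";
        return s;
    }

    public static void main(String[] args) {
        // Short captions pass through untouched.
        System.out.println(trim("Chrome", 35));
        // Long captions are cut so the result is exactly 35 characters long.
        System.out.println(trim("A very long link caption that keeps going and going", 35));
    }
}
```

The result of a cut is always exactly width characters, since one character is reserved for the trailing dot.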
3. Result
Fetching http://www.google.com...
Media: (2)
* img: <http://www.google.co.kr/images/icons/product/chrome-48.png> x ()
* img: <http://www.google.co.kr/textinputassistant/tia.png> 27x23 ()
Imports: (0)
Links: (21)
* a: <http://www.google.co.kr/imghp?hl=ko&tab=wi> (이미지)
* a: <http://maps.google.co.kr/maps?hl=ko&tab=wl> (지도)
* a: <https://play.google.com/?hl=ko&tab=w8> (Play)
* a: <http://www.youtube.com/?gl=KR&tab=w1> (YouTube)
* a: <http://news.google.co.kr/nwshp?hl=ko&tab=wn> (뉴스)
* a: <https://mail.google.com/mail/?tab=wm> (Gmail)
* a: <https://drive.google.com/?tab=wo> (드라이브)
* a: <http://www.google.co.kr/intl/ko/options/> (더보기 »)
* a: <http://www.google.co.kr/history/optout?hl=ko> (웹 기록)
* a: <http://www.google.co.kr/preferences?hl=ko> (설정)
* a: <https://accounts.google.com/ServiceLogin?hl=ko&continue=http://www.google.co.kr/%3Fgfe_rd%3Dcr%26ei%3DtvutVPb5NsG6kAXVy4CwDA> (로그인)
* a: <http://www.google.co.kr/chrome/index.html?hl=ko&brand=CHNG&utm_source=ko-hpp&utm_medium=hpp&utm_campaign=ko> (Chrome 다운로드)
* a: <http://www.google.co.kr/advanced_search?hl=ko&authuser=0> (고급검색)
* a: <http://www.google.co.kr/language_tools?hl=ko&authuser=0> (언어도구)
* a: <http://www.google.co.kr/intl/ko/ads/> (광고 프로그램)
* a: <http://www.google.co.kr/intl/ko/services/> (비즈니스 솔루션)
* a: <https://plus.google.com/102197601262446632410> (+Google)
* a: <http://www.google.co.kr/intl/ko/about.html> (Google 정보)
* a: <http://www.google.co.kr/setprefdomain?prefdom=US&sig=0_zqOlB8Ip0P4chzkeavWzxYifIMQ%3D> (Google.com)
* a: <http://www.google.co.kr/intl/ko/policies/privacy/> (개인정보 보호)
* a: <http://www.google.co.kr/intl/ko/policies/terms/> (약관)
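Every URL in the output is absolute because the sample reads attributes with the "abs:" prefix, which resolves relative values against the page's base URI. The sketch below shows roughly the same resolution using only java.net.URI from the JDK. It is an illustration of the idea, not jsoup's own implementation, and the class name AbsUrlSketch and the relative paths are assumptions chosen to match the output above.

```java
import java.net.URI;

public class AbsUrlSketch {
    // Resolve a possibly relative href against a base URI.
    // For well-formed input this is approximately what "abs:href" yields.
    static String resolve(String baseUri, String href) {
        return URI.create(baseUri).resolve(href).toString();
    }

    public static void main(String[] args) {
        String base = "http://www.google.co.kr/";
        // Root-relative path: replaces the path of the base.
        System.out.println(resolve(base, "/imghp?hl=ko&tab=wi"));
        // Relative path: appended to the base directory.
        System.out.println(resolve(base, "intl/ko/about.html"));
        // Already absolute: returned as-is.
        System.out.println(resolve(base, "https://mail.google.com/mail/?tab=wm"));
    }
}
```

Without the prefix, src.attr("src") or link.attr("href") would return the value exactly as written in the HTML, which may be relative.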
4. jsoup API documentation
5. jsoup cookbook