Study Materials/Java 2015. 1. 8. 12:40

1. Download the jsoup library

http://jsoup.org/download


2. Sample code (args = http://www.google.co.kr)

package org.jsoup.examples;

import org.jsoup.Jsoup;
import org.jsoup.helper.Validate;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.io.IOException;

/**
 * Example program to list links from a URL.
 */
public class ListLinks {
    public static void main(String[] args) throws IOException {
        Validate.isTrue(args.length == 1, "usage: supply url to fetch");
        String url = args[0];
        print("Fetching %s...", url);

        // Fetch and parse the page, then select the elements of interest.
        Document doc = Jsoup.connect(url).get();
        Elements links = doc.select("a[href]");
        Elements media = doc.select("[src]");
        Elements imports = doc.select("link[href]");

        // Every element with a src attribute; images also show size and alt text.
        print("\nMedia: (%d)", media.size());
        for (Element src : media) {
            if (src.tagName().equals("img"))
                print(" * %s: <%s> %sx%s (%s)",
                        src.tagName(), src.attr("abs:src"), src.attr("width"), src.attr("height"),
                        trim(src.attr("alt"), 20));
            else
                print(" * %s: <%s>", src.tagName(), src.attr("abs:src"));
        }

        // <link href=...> elements such as stylesheets.
        print("\nImports: (%d)", imports.size());
        for (Element link : imports) {
            print(" * %s <%s> (%s)", link.tagName(), link.attr("abs:href"), link.attr("rel"));
        }

        // Anchors, printed as absolute URL plus shortened link text.
        print("\nLinks: (%d)", links.size());
        for (Element link : links) {
            print(" * a: <%s>  (%s)", link.attr("abs:href"), trim(link.text(), 35));
        }
    }

    // printf-style helper that always ends the line.
    private static void print(String msg, Object... args) {
        System.out.println(String.format(msg, args));
    }

    // Shortens s to at most width characters, marking the cut with a trailing ".".
    private static String trim(String s, int width) {
        if (s.length() > width)
            return s.substring(0, width-1) + ".";
        else
            return s;
    }
}
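
The trim helper in the sample above is easy to misread, so here is a minimal standalone sketch of the same logic using only the JDK. The class name TrimDemo and the sample strings are made up for illustration; only the body of trim is taken from the sample.

```java
public class TrimDemo {
    // Same logic as the trim helper in ListLinks: strings longer than
    // width are cut to width-1 characters and get a trailing ".".
    public static String trim(String s, int width) {
        if (s.length() > width)
            return s.substring(0, width - 1) + ".";
        else
            return s;
    }

    public static void main(String[] args) {
        // Shorter than the limit: returned unchanged.
        System.out.println(trim("short", 20));
        // Longer than the limit: cut to 4 characters plus "." (5 in total).
        System.out.println(trim("abcdefghij", 5));
    }
}
```

The result is never longer than width characters, which is why the link texts in the output stay within 35 characters.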


3. Result

Fetching http://www.google.com...

Media: (2)
 * img: <http://www.google.co.kr/images/icons/product/chrome-48.png> x ()
 * img: <http://www.google.co.kr/textinputassistant/tia.png> 27x23 ()

Imports: (0)

Links: (21)
 * a: <http://www.google.co.kr/imghp?hl=ko&tab=wi>  (이미지)
 * a: <http://maps.google.co.kr/maps?hl=ko&tab=wl>  (지도)
 * a: <https://play.google.com/?hl=ko&tab=w8>  (Play)
 * a: <http://www.youtube.com/?gl=KR&tab=w1>  (YouTube)
 * a: <http://news.google.co.kr/nwshp?hl=ko&tab=wn>  (뉴스)
 * a: <https://mail.google.com/mail/?tab=wm>  (Gmail)
 * a: <https://drive.google.com/?tab=wo>  (드라이브)
 * a: <http://www.google.co.kr/intl/ko/options/>  (더보기 »)
 * a: <http://www.google.co.kr/history/optout?hl=ko>  (웹 기록)
 * a: <http://www.google.co.kr/preferences?hl=ko>  (설정)
 * a: <https://accounts.google.com/ServiceLogin?hl=ko&continue=http://www.google.co.kr/%3Fgfe_rd%3Dcr%26ei%3DtvutVPb5NsG6kAXVy4CwDA>  (로그인)
 * a: <http://www.google.co.kr/chrome/index.html?hl=ko&brand=CHNG&utm_source=ko-hpp&utm_medium=hpp&utm_campaign=ko>  (Chrome 다운로드)
 * a: <http://www.google.co.kr/advanced_search?hl=ko&authuser=0>  (고급검색)
 * a: <http://www.google.co.kr/language_tools?hl=ko&authuser=0>  (언어도구)
 * a: <http://www.google.co.kr/intl/ko/ads/>  (광고 프로그램)
 * a: <http://www.google.co.kr/intl/ko/services/>  (비즈니스 솔루션)
 * a: <https://plus.google.com/102197601262446632410>  (+Google)
 * a: <http://www.google.co.kr/intl/ko/about.html>  (Google 정보)
 * a: <http://www.google.co.kr/setprefdomain?prefdom=US&sig=0_zqOlB8Ip0P4chzkeavWzxYifIMQ%3D>  (Google.com)
 * a: <http://www.google.co.kr/intl/ko/policies/privacy/>  (개인정보 보호)
 * a: <http://www.google.co.kr/intl/ko/policies/terms/>  (약관)
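
Every URL in the output is absolute because the sample reads attributes with the abs: prefix (abs:href, abs:src), which asks jsoup to resolve the value against the document's base URI. The sketch below only illustrates that kind of resolution with the JDK's java.net.URI, as a stand-in that needs no jsoup on the classpath; it is not jsoup's own implementation. The class name AbsUrlDemo and the relative href values are assumptions for illustration, since the post does not show the raw attribute values of the page.

```java
import java.net.URI;

public class AbsUrlDemo {
    // Resolves href against base, similar in spirit to what the abs: prefix yields.
    public static String resolve(String base, String href) {
        return URI.create(base).resolve(href).toString();
    }

    public static void main(String[] args) {
        String base = "http://www.google.co.kr/"; // assumed base URI

        // A root-relative href is joined onto the host of the base URI.
        System.out.println(resolve(base, "/intl/ko/about.html"));
        // An href that is already absolute is returned as-is.
        System.out.println(resolve(base, "https://mail.google.com/mail/?tab=wm"));
    }
}
```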



4. jsoup API documentation

http://jsoup.org/apidocs/


5. jsoup cookbook

http://jsoup.org/cookbook/

posted by cozyboy