Parsing HTML with JSoup

March 13, 2018

Posted by İbrahim

Hello,

I tried to fetch the HTML of this site's home page with JSoup. I could not extract the following part the way I wanted:

<td class="lastpost">Son mesaj 2 hafta önce<br> <a href="post/13594">2018 D anketi</a></td></tr>

I want to get it as three separate values, like this:

String metin = "Son mesaj 2 hafta önce";
String sonKonu = "2018 D anketi";
String link = "post/13594";

Here is my code:

try {
    Document doc = Jsoup.connect("http://ddili.org/forum/").get();
    Elements threads = doc.select("div.forum_content.forum_forum");

    for (Element thread : threads) {
        // This is actually wrong: text() merges the cell's own text with the link text
        String lastPost = thread.select("td.lastpost").text();
        System.out.println(lastPost);
    }

} catch (IOException e) {
    e.printStackTrace();
}

Result:

String lastPost = "Son mesaj 2 hafta önce 2018 D anketi";

However, I could not work out how to get the three parts separately. What approach should I follow?

When parsing a site's HTML, the browser's "Inspect Element" feature gets the job done, but are there methods or tools that make HTML parsing easier than that?
Thanks.

--
[ This post was converted from http://ddili.org/forum. ]
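
One possible way to split the td.lastpost cell into the three values asked about in the post is sketched below. It is only a sketch, not a confirmed answer from the thread: the class name LastPostExample and the method splitLastPost are hypothetical, the sample cell is wrapped in a table and row because JSoup's HTML parser drops a td that appears outside a table, and it assumes each cell has its plain text followed by a single link, as in the snippet above.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// Hypothetical helper class, for illustration only.
public class LastPostExample {

    // Returns {metin, sonKonu, link} for the first td.lastpost cell,
    // or null when the document has no such cell.
    public static String[] splitLastPost(String html) {
        Document doc = Jsoup.parse(html);
        Element cell = doc.select("td.lastpost").first();
        if (cell == null) {
            return null;
        }

        // ownText() returns only the text that belongs to the cell itself,
        // without the text of child elements such as <a>.
        String metin = cell.ownText().trim();

        // The link text and its href come from the <a> child, if there is one.
        Element anchor = cell.select("a").first();
        String sonKonu = (anchor == null) ? "" : anchor.text();
        String link = (anchor == null) ? "" : anchor.attr("href");

        return new String[] { metin, sonKonu, link };
    }

    public static void main(String[] args) {
        // Same cell as in the post. The o-umlaut in "once" is written as a
        // Unicode escape so this source file stays ASCII-only.
        String html = "<table><tr>"
                + "<td class=\"lastpost\">Son mesaj 2 hafta \u00f6nce<br> "
                + "<a href=\"post/13594\">2018 D anketi</a></td>"
                + "</tr></table>";

        String[] parts = splitLastPost(html);
        System.out.println("metin   = " + parts[0]);
        System.out.println("sonKonu = " + parts[1]);
        System.out.println("link    = " + parts[2]);
    }
}
```

In the loop from the post, the same calls could presumably be applied to each element returned by select("td.lastpost"). When the document comes from Jsoup.connect(...).get() it carries a base URI, so anchor.absUrl("href") should return the full URL instead of the relative "post/13594".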

for (int i = 1; i < rows.size(); i++) { // first row is the col names so skip it.
    Element row = rows.get(i);
    Elements cols = row.select("td");
    if (cols.get(7).text().equals("down")) {
        downServers.add(cols.get(5).text());
    }
}
