How to Parse Correctly With AngleSharp?

Hello! I have already read through many similar topics, but they mostly ask about specific cases. I very much agree with these words. Honestly, nothing is clear to me. I gathered that information is extracted from HTML either through LINQ or through CSS selectors. I am not familiar with the first at all, and I know CSS only superficially. Still, the CSS option feels intuitively closer to me, so I would like answers in the form of CSS selectors.

A question right away: can all possible information be extracted either way? Or are there cases where only one of the approaches works? Or cases where it is impossible?

Now to the task itself. I want to parse contact data from the homeford site. For example, take this page. To start, I parse the full page:

var parser = new HtmlParser();
var doc = parser.Parse("Reserv");

How, for example, do I extract the name? Looking at the source, I see that the name is inside the block
div class="df_panel". This block seems to have a unique name, so the search can be narrowed:

var div = doc.QuerySelector("div.df_panel");

Here the questions begin right away. I worked out on my own that if the div has a class name, the selector is written as shown above. If, for example, it is div id="test", the query is written differently (it took me a long time and a pile of examples from various forums to get there):

var div = doc.QuerySelector("div[id='test']");

Where is any of this documented? I understand that some kind of regular expressions are used here. Perhaps they are the same as in other parsers; for example, it is written here that AngleSharp is very similar to Fizzler. But what if I have a specific problem and have never dealt with other parsers? How am I supposed to know what exactly to write?

Okay, I digress. I now have the closest div for narrowing the search. (Another digression, by the way: what if there were none at all? Is it possible to get specific data if there are no unique identifiers by which the search area is gradually narrowed?) So, we see that the name is written in the heading tag <h6 itemprop="name">the necessary name</h6>. How do I get this value? Would it be possible to extract the name if it were written without any heading tag at all?

That is all the questions for now. I would be grateful for any explanation. I would especially like answers to the more general questions (for example, where I can find help or good examples on these regular expressions); then I can deal with the rest myself.


Answer 1, Authority 100%

The most important thing

First of all, you need to learn CSS selectors.
And for a better understanding, at least the basics of HTML as well.
You can do this, for example, at HTML Academy. It is free.

I would also add that there is no magic here: all AngleSharp selectors are
standard CSS selectors, not something unusual. (c) Reinraus

I will answer your question:

But what if I have a specific problem and have never dealt with
other parsers? How am I supposed to know what exactly
to write?

To understand what you need to write, I repeat, you need to learn CSS selectors.
You can do that, for example, here: "Do you know selectors?".

Here is a brief excerpt from the article above:

Basic types of selectors

There are only a few main types of selectors:

* – any elements.
div – elements with that tag.
#id – the element with that id.
.class – elements with that class.
[name="value"] – attribute selectors (see below).
:visited – "pseudo-classes", various other conditions on the element (see below).

Selectors can be combined by writing them one after another, without a space:

.c1.c2 – elements that have both classes c1 and c2
a#id.c1.c2:visited – an a element with the given id, the classes c1 and c2, and the pseudo-class visited
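
The basic selectors above can be tried out directly in AngleSharp. The following is only a minimal sketch: the markup is made up for illustration, and it assumes AngleSharp 0.10 or later, where the parser method is ParseDocument (older versions, as in the question, used Parse).

```csharp
using System;
using AngleSharp.Html.Parser;

// Hypothetical markup, invented purely for this illustration.
const string html = @"
<div id='test' class='c1 c2'>first</div>
<div class='c1'>second</div>
<p class='c2'>third</p>";

// Assumption: AngleSharp 0.10+ (ParseDocument instead of the older Parse).
var doc = new HtmlParser().ParseDocument(html);

// Tag selector: every div in the document.
Console.WriteLine(doc.QuerySelectorAll("div").Length);

// Id selector and the equivalent attribute selector.
Console.WriteLine(doc.QuerySelector("#test")?.TextContent);
Console.WriteLine(doc.QuerySelector("div[id='test']")?.TextContent);

// Class selector: every element carrying the class c2.
Console.WriteLine(doc.QuerySelectorAll(".c2").Length);

// Combined without spaces: a div that has both c1 and c2.
Console.WriteLine(doc.QuerySelectorAll("div.c1.c2").Length);
```

QuerySelector returns the first match (or null), while QuerySelectorAll returns every match, so the second is usually the safer choice while exploring an unfamiliar page.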

Relationships

CSS3 provides four types of relationships between elements.

You probably know the two best-known ones:

div p – p elements that are descendants of div.
div > p – only direct descendants.
There are two rarer ones:

div ~ p – right siblings: all p at the same nesting level that come after div.
div + p – the first right sibling: the p at the same nesting level that comes immediately after div (if any).
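
To see how the four relationships differ, here is a small sketch that counts matches for each one. The markup is hypothetical, and the code assumes AngleSharp 0.10 or later (ParseDocument).

```csharp
using System;
using AngleSharp.Html.Parser;

// Hypothetical markup: one nested p and two p siblings after the div.
const string html = @"
<section>
  <div><p>inner</p></div>
  <p>adjacent</p>
  <span>x</span>
  <p>later</p>
</section>";

var doc = new HtmlParser().ParseDocument(html);

// Descendant: every p anywhere inside section (inner, adjacent, later).
Console.WriteLine(doc.QuerySelectorAll("section p").Length);

// Child: only p elements directly under section (adjacent, later).
Console.WriteLine(doc.QuerySelectorAll("section > p").Length);

// General sibling: every p after the div on the same level (adjacent, later).
Console.WriteLine(doc.QuerySelectorAll("div ~ p").Length);

// Adjacent sibling: only the p immediately after the div (adjacent).
Console.WriteLine(doc.QuerySelector("div + p")?.TextContent);
```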

Attribute selectors

On the entire attribute:

  • [attr] – the attribute is set,
  • [attr="val"] – the attribute equals val.

At the beginning of the attribute:

  • [attr^="val"] – the attribute starts with val, for example value.
  • [attr|="val"] – the attribute equals val or starts with val-, for example val-1.

On the contents:

  • [attr*="val"] – the attribute contains the substring val, for example myvalue.
  • [attr~="val"] – the attribute contains val as one of its space-separated values.
    For example: [attr~="delete"] matches "edit delete", but does not match "undelete" and does not match "no-delete" either.

At the end of the attribute:

  • [attr$="val"] – the attribute ends with val, for example myval.
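
The attribute operators are easy to confuse, so a side-by-side run may help. This is a sketch on invented markup; the data-x attribute and the Match helper are hypothetical names introduced only for the example, and AngleSharp 0.10 or later is assumed.

```csharp
using System;
using System.Linq;
using AngleSharp.Html.Parser;

// Hypothetical markup: the text of each link is just a label for the output.
const string html = @"
<a data-x='value'>1</a>
<a data-x='val-1'>2</a>
<a data-x='myvalue'>3</a>
<a data-x='edit delete'>4</a>
<a data-x='myval'>5</a>
<a>6</a>";

var doc = new HtmlParser().ParseDocument(html);

// Hypothetical helper: labels of all elements matching a selector.
string Match(string selector) =>
    string.Join(",", doc.QuerySelectorAll(selector).Select(e => e.TextContent));

Console.WriteLine(Match("[data-x]"));            // attribute is present
Console.WriteLine(Match("[data-x='val-1']"));    // exact value
Console.WriteLine(Match("[data-x^='val']"));     // starts with val
Console.WriteLine(Match("[data-x|='val']"));     // equals val or starts with val-
Console.WriteLine(Match("[data-x*='val']"));     // contains val anywhere
Console.WriteLine(Match("[data-x~='delete']"));  // has delete as a whole word
Console.WriteLine(Match("[data-x$='val']"));     // ends with val
```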

Where to practice?

CSS Diner – here you need to select the elements that match the specified CSS rule.

HTML Academy – here you can learn the basics of markup.
HTMLBook – a reference for CSS selectors and HTML tags.

Answers to other questions

Where is any of this documented? I understand that some kind of
regular expressions are used here.

These are not regular expressions, but CSS selectors. I wrote about this above.

Is it possible to get specific data if there are no
unique identifiers by which the search area
is gradually narrowed?

Yes, by selecting elements on any criterion you like. For example, the first child of its parent (* > *:first-child), a p that is exactly the second child of its parent (p:nth-child(2)), a non-empty a (a:not(:empty)), and so on.
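
A short sketch of such position-based selectors, for a page with no ids or classes at all. The markup is hypothetical and AngleSharp 0.10 or later is assumed.

```csharp
using System;
using AngleSharp.Html.Parser;

// Hypothetical markup without any ids or classes.
const string html = @"
<ul><li>a</li><li>b</li></ul>
<div><span>s</span><p>second</p><a></a><a>link</a></div>";

var doc = new HtmlParser().ParseDocument(html);

// First child of its parent.
Console.WriteLine(doc.QuerySelector("ul > li:first-child")?.TextContent);

// A p that is the second child of its parent.
Console.WriteLine(doc.QuerySelector("div > p:nth-child(2)")?.TextContent);

// An a element that is not empty.
Console.WriteLine(doc.QuerySelector("a:not(:empty)")?.TextContent);
```

Selectors like these depend on the exact layout of the page, so they tend to break sooner than id or attribute selectors when the markup changes.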

So, we see that the name is written in the heading tag <h6 itemprop="name">the necessary name</h6>. How do I get
this value? Would it be possible to extract the name if it were written
without any heading tag at all?

If I understood correctly and you mean a value that is not wrapped in any tag, then the answer is still yes. You can search by text.
For this particular case, the solution as a selector is: h6[itemprop="name"]

No matter what I tried, it does not parse with .df_panel

Your code from the comments uses the QuerySelector method, which, as I understand it, selects the first element that matches the specified selector. The first of the six .df_panel elements on the page does not contain an h6 element, so you find nothing. To emphasize once more: there are six .df_panel elements on the page.

The code for the selection you need:

var seller = doc.QuerySelector("[itemprop='seller']");
var name = seller.QuerySelector("[itemprop='name']").Text();
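
The pitfall described above can be reproduced on a reduced example. This is only a sketch: the markup below is a hypothetical two-panel stand-in for the real page (whose exact structure is not shown here), and AngleSharp 0.10 or later is assumed.

```csharp
using System;
using AngleSharp.Html.Parser;

// Hypothetical stand-in for the page: the first panel has no h6.
const string html = @"
<div class='df_panel'><p>no heading here</p></div>
<div class='df_panel' itemprop='seller'><h6 itemprop='name'>the necessary name</h6></div>";

var doc = new HtmlParser().ParseDocument(html);

// QuerySelector returns only the FIRST panel, which contains no h6.
var firstPanel = doc.QuerySelector("div.df_panel");
var missing = firstPanel?.QuerySelector("h6");
Console.WriteLine(missing == null);

// Searching across all panels with one selector finds the heading.
var heading = doc.QuerySelector("div.df_panel h6[itemprop='name']");
Console.WriteLine(heading?.TextContent);

// The approach from the answer: narrow by itemprop instead of by class.
var seller = doc.QuerySelector("[itemprop='seller']");
var name = seller?.QuerySelector("[itemprop='name']")?.TextContent;
Console.WriteLine(name);
```

The null-conditional operators are there because QuerySelector returns null when nothing matches, which is exactly what happens with the first panel.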

Answer 2

But what if I have a specific problem and have never dealt with
other parsers? How am I supposed to know what exactly
to write?

It was already answered above that you need to know CSS selectors. However, since this question shows up in search engines (there are not that many questions on RU-SO), I will add a useful link on the topic.

In the official AngleSharp repositories on GitHub there is a demo application.
