Actual: How to import data with proper readable payee?

Atemu@lemmy.ml · 2 months ago

TS is a lot easier to set up than WG and does not require a publicly accessible IP address nor anything public whatsoever. It’s not really comparable to setting WG up yourself; especially w.r.t. security.

Atemu@lemmy.ml · 2 months ago

It’s a central server (that you could actually self-host publicly if you wanted to) whose purpose it is to facilitate P2P connections between your devices.

If you were outside your home network and wanted to connect to your server from your laptop, both devices would be connected to the TS server independently. When attempting to send IP packets between the devices, the initiating device (i.e. your laptop) would establish a direct WireGuard tunnel to the receiving device. This process is managed by the individual devices while the central TS service merely facilitates communication between the devices for the purpose of establishing this connection.

Atemu@lemmy.ml · 2 months ago

If you’re worried about that, I can recommend a service like Tailscale which does not require permanently open ports to the outside world, offering quite a bit more security than an exposed traditional VPN server.

Atemu@lemmy.ml · 2 months ago

The writer will need to tag things down, to minimal details, for the sake of languages that they don’t care about.

Sure and that’s likely a good bit of work.

However, you must consider the alternative which is translating the entire text to dozens of languages and doing the same for any update done to said text. I’d assume that to be even more work by at least one order of magnitude.

Many languages are quite similar to one another. An article written in the hypothetical abstract language and tuned on an abstract level to produce good results in German would likely produce good results in Dutch too and likely wouldn’t need much tweaking for good results in e.g. English. This has the potential to save a ton of work.

This issue affects languages as a whole, and sometimes in ways that you can’t arbitrate through a fixed writing style because they convey meaning.

The point of the abstract language would be to convey the meaning without requiring a language-specific writing style. The language-specific writing style to convey the specified meaning would be up to the language-specific “renderers”.

(For example: if you don’t encode the social gender into the 3rd person pronouns, English breaks.)

That’s up to the English “renderer” to do. If it decides to use a pronoun for e.g. a subject that identifies as male, it’d use “he”. All the abstract language’s “sentence” would contain is the concept of a male-identifying subject. (It probably shouldn’t even encode the fact that a pronoun is used as usage of pronouns instead of nouns is also language-specific. Though I guess it could be an optional tag.)

Often there’s no such thing as the “default”. The example with pig/pork is one of those cases - if whoever is writing the article doesn’t account for the fact that English uses two concepts (pig vs. pork) for what Spanish uses one (cerdo = puerco etc.), and assumes the default (“pig”), you’ll end with stuff like *“pig consumption has increased” (i.e. “pork consumption has increased”). And the abstraction layer has no way to know if the human is talking about some living animal or its flesh.

No, that’d simply be a mistake in building the abstract sentence. The concept of a pig was used rather than the concept of edible meat made from pig which would have been the correct subject to use in this sentence.

Mistakes like this will happen and I’d even consider them likely to happen but the cool thing here is that “pig consumption has increased”, while obviously slightly wrong, would still be quite comprehensible. That’s an insane advantage considering that this would apply to any language for which a generic “renderer” was implemented.
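To make the distinction concrete, here is a minimal sketch, assuming a hypothetical per-language lexicon keyed by abstract concept. The concept names (`pig-animal`, `pig-meat`), the `LEXICON` table and the sentence templates are all made up for illustration; they are not taken from any existing system.

```python
# Hypothetical lexicon: one word per abstract concept, per language.
# English needs two different words, Spanish uses the same word for both.
LEXICON = {
    "en": {"pig-animal": "pig", "pig-meat": "pork"},
    "es": {"pig-animal": "cerdo", "pig-meat": "cerdo"},
}

# Hypothetical fixed templates for "consumption of <concept> has increased".
TEMPLATES = {
    "en": "{} consumption has increased",
    "es": "el consumo de {} ha aumentado",
}


def render_consumption(concept: str, lang: str) -> str:
    """Look up the language-specific word for a concept and fill the template."""
    word = LEXICON[lang][concept]
    return TEMPLATES[lang].format(word)


# Correct concept: the edible meat.
print(render_consumption("pig-meat", "en"))
# Mistaken concept: the animal. Slightly wrong in English, still comprehensible.
print(render_consumption("pig-animal", "en"))
# In Spanish both concepts render identically.
print(render_consumption("pig-animal", "es"))
```

Picking the wrong concept only degrades the output in languages that distinguish the two; the sentence stays understandable everywhere.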

It ends like that story about a map so large that it represents the terrain accurately being as big as the terrain, thus useless.

As I said in the top, you’ll end with a “map” that is as large as the “terrain”, thus useless. (Or: spending way more effort explicitly describing all concepts that it’s simply easier to translate it by hand.)

I don’t see how that would necessarily be the case. Most sentences on Wikipedia are of descriptive nature and follow rather simple structures; only complicated further for the purpose of aiding text flow. Let’s take the first sentence of the Wikipedia article on Lemmy:

Lemmy is a free and open-source software for running self-hosted social news aggregation and discussion forums.

This could be represented in a hypothetical abstract sentence like this:

(explanation
 (proper-noun "lemmy")
 (software-facilitating
  :kind FOSS
  :purpose (purposes
            (apply-property 'self-hosted '(news-aggregation-platform discussion-forum)))))

(IDK why I chose lisp to represent this but it felt surprisingly natural.)

What this says is that this sentence explains the concept of lemmy by equating it with the concept of a software which facilitates the combination of multiple purposes.

A language-specific “renderer” such as the English one would then take this abstract representation and turn it into an English sentence:

The concept of an explanation of a thing would then be turned into an explanation sentence. Explanation sentences depend on what it is that is being explained. In this case, the subject is specifically marked as a proper noun which is usually explained using a structure like “<explained thing> is <explanation>”. (An explanation for a different type of word could use a different structure.) Because it’s a proper noun and at the beginning of a sentence, “Lemmy” would be capitalised.

Next comes the explanation part, which is declared as the concept of being software of the kind FOSS facilitating some purpose. The combined concept of an object and its purpose is represented as “<object> for the purpose of <purpose>” in English. The object is FOSS here and specifically a software facilitating some purpose, so the English “renderer” can expand this into “free and open-source software for the purpose of facilitating <purpose>”.

The purpose given is the purpose of having multiple purposes and this concept simply combines multiple purposes into one.
The purposes are two objects to which a property has been applied. In English, the concept of applying a property is represented as “a <property as adjective> <object>”, so in this case “a self-hosted news-aggregation platform” and “a self-hosted online discussion forum”. These purposes are then combined using the standard English method of combining multiple objects which is listing them: “a self-hosted news-aggregation platform and a self-hosted online discussion forum”. Because both purposes have the same adjective applied, the English “renderer” would likely make the stylistic choice of implicitly applying it to both which is permitted in English: “a self-hosted news-aggregation platform and online discussion forum”.

It would then be able to piece together this English sentence: “Lemmy is a free and open-source software for the purposes of facilitating a self-hosted news-aggregation platform and online discussion forum.”

You could be even more specific in the abstract sentence in order to get exactly the original sentence but this is also a perfectly valid sentence for explaining Lemmy in English. All just from declaring concepts in an abstract way and transforming that abstract representation into natural language text using static rules.
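As a rough illustration of the static rules walked through above, here is a minimal Python sketch of such an English “renderer”. Everything in it is an assumption made for the example: the class names, the `EN_WORDS` table and the exact templates are hypothetical, and it only handles this one sentence shape.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ProperNoun:          # (proper-noun "lemmy")
    name: str


@dataclass
class WithProperty:        # (apply-property 'prop '(obj ...))
    prop: str
    objects: list[str]


@dataclass
class Software:            # (software-facilitating :kind ... :purpose ...)
    kind: str
    purpose: WithProperty


@dataclass
class Explanation:         # (explanation subject body)
    subject: ProperNoun
    body: Software


# Hypothetical English lexicon for the concepts used in the example.
EN_WORDS = {
    "FOSS": "free and open-source software",
    "self-hosted": "self-hosted",
    "news-aggregation-platform": "news-aggregation platform",
    "discussion-forum": "online discussion forum",
}


def render_purpose(purpose: WithProperty) -> str:
    """List the objects; the shared adjective is stated once and applies to all."""
    nouns = [EN_WORDS[obj] for obj in purpose.objects]
    return f"a {EN_WORDS[purpose.prop]} " + " and ".join(nouns)


def render_en(sentence: Explanation) -> str:
    """Proper-noun explanation: '<explained thing> is <explanation>'."""
    subject = sentence.subject.name.capitalize()  # sentence start + proper noun
    kind = EN_WORDS[sentence.body.kind]
    purpose = render_purpose(sentence.body.purpose)
    return f"{subject} is a {kind} for the purposes of facilitating {purpose}."


abstract = Explanation(
    ProperNoun("lemmy"),
    Software(
        kind="FOSS",
        purpose=WithProperty(
            "self-hosted", ["news-aggregation-platform", "discussion-forum"]
        ),
    ),
)
print(render_en(abstract))
```

A renderer for another language would reuse the same `abstract` value and only swap the lexicon and the templates.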

Atemu@lemmy.ml · 2 months ago

Yes, yes they will. If you’re the sole user, they’d identify you from your behaviour anyways.

I don’t think an internet proxy will help very much w.r.t. privacy but it will make you a lot more susceptible to being blocked.

Atemu@lemmy.ml · 3 months ago

Is “Grouped Results” disabled in settings?

Atemu@lemmy.ml · 3 months ago

Certainly better than the U.S. in that regard but I wouldn’t consider Germany “resilient” either.

Atemu@lemmy.ml · 3 months ago

Sorry, can’t answer that as my crystal ball is broken at the moment.

Atemu@lemmy.ml · 3 months ago

I think it could be because Google may offer them quite a bit longer hardware support. They had to go with some industrial SoC for the FP5 to get Qualcomm to offer even a half decent hardware support cycle.

Atemu@lemmy.ml · edit-2 3 months ago

Whether this is bad depends on your threat model. Additionally, you must also consider that other search engines are able to easily identify you without you explicitly identifying yourself. If you can’t fool https://abrahamjuliot.github.io/creepjs/, you certainly can’t fool Google for instance. And that’s even ignoring the immense identifying potential of user behaviour.

Billing supports OpenNode AFAICT which I guess you could funnel your Moneros through but meh.

Edit: Phrasing.

Atemu@lemmy.ml · 3 months ago

I think you’re underestimating how huge of an undertaking a half-decent search index is, much less a good one.

Atemu@lemmy.ml · 3 months ago

I personally have not found Kagi’s default search results to be all that impressive

At their worst, they’re as bad as Google’s. For me however, this is a great improvement over using bing/Google proxies which would be the alternative.

maybe if I took the time to customize, I might feel differently.

That’s the killer feature IMHO.

Atemu@lemmy.ml · 3 months ago

Your search results look very different to mine:

Did you disable Grouped Results?

All the LLM-generated “top 10” listicles are grouped into one large block I can safely ignore. (I could hide them entirely but the visual grouping allows for easy mental filtering, so I haven’t bothered.) Your weird top10 fake site does not show up.

But yes, as the linked article says, Kagi is primarily a proxy for Google with some extra on top. This is, unfortunately, a feature as Google’s index still reigns supreme for general purpose search. It absolutely is bad and getting worse but sadly still the best you can get. Using only non-Google indices would just result in bad search results.
The Google-ness is somewhat mitigated by Kagi-exclusive features such as the LLM garbage grouping.

What Google also cannot do is highlighted in my screenshot: You can customise filtering and ranking.
The first search result is a Reddit thread with some decent discussion because I configured Kagi to prefer Reddit search results. In the case of household appliances, this doesn’t do a whole lot as I have not researched trusted/untrusted sources in this field yet but it’s very noticeable in fields like programming where I have manually ranked sites.

Kagi is not “all about” privacy. It’s a factor, sure, but ultimately you still have to trust a U.S. company. Better than “trusting” a known abuser (Google, M$) but without an external audit, I wouldn’t put too much weight into this.
The index ain’t it either as it’s mostly Google though sometimes a bit better.
What really sets it apart is the features. Customised ranking as well as blocking some sites outright (bye bye pinterest and userbenchmark) are immensely useful. So is filtering out garbage results that Google still likes to return.

Atemu@lemmy.ml · 3 months ago

That whole situation was such an overblown idiotic mess. Kagi has always used indices from companies that do far more unethical things than committing the extreme crime of having a CEO who has stupid opinions on human rights.
I 100% agree with Vlad’s response to this whole thing and anyone who thinks otherwise should question what exactly it is they’re criticising.

I don’t like Brave (super shady IMHO) and certainly not their CEO but I didn’t sign up for a 100% ethically correct search engine, I signed up for a search engine with innovative features and good search results. The only viable alternatives are to use 100% not ethically correct search indices with meh (Google) to bad (Bing, DDG) search results. If you’re going to tell me how Google and M$ are somehow ethical, I’m going to have to laugh at you.

The whole argument amounts to whining about the status quo and bashing the one company that tries anything to change it. The only way to get away from the Google monopoly is alternative indices. Yes those alternatives may not be much more ethical than friggin Google. So what.

Atemu@lemmy.ml · 3 months ago

To the person receiving the money, it is worth it. Else they wouldn’t be doing it.

Atemu@lemmy.ml · 3 months ago

Yes and that’s precisely the point. You can make the decision not to pay and there are good reasons to do so (I do so too) but you must recognise that someone is still not getting paid for their work.

Atemu@lemmy.ml · 3 months ago

Cool story bro but you clearly still didn’t even read the first sentence of what I wrote.

Atemu@lemmy.ml · 3 months ago

What a great argument! You didn’t even read the first sentence…

It isn’t an ethical concern and hasn’t been since the 90s.

You’ll have to explain to me how not compensating someone for their work has been ethical since the 90s.

Atemu@lemmy.ml · 3 months ago

Security knowledge and ethical concerns are two separate things. Whether we like it or not, we pay online creators through private data we must give to entities who will use it against our best interests.

Atemu@lemmy.ml · 3 months ago

I do like the idea of using USB drives for storage, though…

I wholeheartedly don’t.

Atemu@lemmy.ml · 5 months ago

Actual: How to import data with proper readable payee?

Atemu@lemmy.ml · 10 months ago

This $250 Ryzen Pre-Built is a BEAST Home Server!

Atemu@lemmy.ml · 1 year ago

How do you encode your paper scans?

Atemu@lemmy.ml · 1 year ago

How to debug broken compass?
